Discussion:
[std-proposals] More conservative version of P0907 "Signed Integers are Two's Complement"
Arthur O'Dwyer
2018-02-23 22:05:16 UTC
The pre-JAX mailing contains this discussion-provoking paper by JF Bastien:
P0907R0 "Signed Integers are Two's Complement"
<http://www.open-std.org/jtc1/sc22/wg21/docs/papers/2018/p0907r0.html>

My understanding is that JF wants to use this "modest proposal" (for
wrapping arithmetic on `int`, among other things) as a way to incite
discussion among the Committee.

I have created an as-yet-unofficial "conservative fork" of the proposal,
which removes the parts that I think are airballs, while leaving in much of
what I consider the good stuff — notably, making signed-to-unsigned and
unsigned-to-signed conversions well-defined in terms of two's complement
representations, and defining what happens when you bit-shift into or out
of the sign bit.
https://quuxplusone.github.io/draft/twosc-conservative.html

I hope that if the Committee asks JF to come back with a more conservative
proposal, the existence of my "conservative fork" will save time, possibly
even allow further discussion later in the week at JAX.

I personally will not be at JAX, though. JF, will you be? Could I count on
you to... not to "champion" my unsubmitted paper, of course, but just to be
aware of it in case something like it is asked for by the Committee? I
mean, the worst case, which I would like to avoid, is that JF's paper is
rejected as too crazy and then the entire subject is tabled until
Rapperswil. I would like to see some concrete progress in this department
at JAX if humanly possible.

–Arthur

P.S. — Also, if anyone on std-proposals has objections to the specific
diffs in my conservative proposal, I would like to know about it. I
deliberately tried to remove any whiff of controversy from the diff. (This
is distinct from objecting to my presumptuousness or objecting to wasting
the Committee's time. ;))
--
You received this message because you are subscribed to the Google Groups "ISO C++ Standard - Future Proposals" group.
To unsubscribe from this group and stop receiving emails from it, send an email to std-proposals+***@isocpp.org.
To post to this group, send email to std-***@isocpp.org.
To view this discussion on the web visit https://groups.google.com/a/isocpp.org/d/msgid/std-proposals/CADvuK0LuUyyHCDhYZ%2BLn2iNuj9WH1bfYrfNoRgD06C5wdqW35w%40mail.gmail.com.
JF Bastien
2018-02-23 22:11:51 UTC
Hi Arthur,

I’ll be in JAX and will definitely champion my own approach, but will make sure any other approaches are discussed thoroughly. I’ll then ask for direction polls from EWG, and will follow the proposed direction in an updated paper.

I’d therefore appreciate it if you instead had rationales for why you’d take a different approach. I offered my rationale for going the direction I suggest, and the right thing to change people’s minds is to offer compelling arguments to go in another direction. Ideally you’d have data to back up any performance / bug finding / odd architecture claims. Do you think you have time to get this together before JAX? That’s definitely something I’d present alongside my paper, though I might disagree.

In the end I simply want two’s complement, and I see a few ways that this could play out with everyone liking the outcome. I just ask that opposition comes with rationales, not “I like my way better”. :-)

JF
Arthur O'Dwyer
2018-02-23 22:47:20 UTC
Post by JF Bastien
Hi Arthur,
I’ll be in JAX and will definitely champion my own approach, but will make
sure any other approaches are discussed thoroughly. I’ll then ask for
direction polls from EWG, and will follow the proposed direction in an
updated paper.
And if you manage to write the updated paper overnight and have it back on
the table the next day at JAX, then my paper *will* be utterly superfluous.
:)

I am, however, worried that the writing of a new paper might slip more than
a day, which would end up with you coming back in the pre-Rapperswil
mailing with another two's-complement paper after the perception that your
first two's-complement paper was "rejected" in Jacksonville, which would
set a perceived negative precedent in people's minds.
Post by JF Bastien
I’d therefore appreciate it if you instead had rationales for why you’d
take a different approach. I offered my rationale for going the direction I
suggest, and the right thing to change people’s minds is to offer
compelling arguments to go in another direction. Ideally you’d have data to
back up any performance / bug finding / odd architecture claims. Do you
think you have time to get this together before JAX? That’s definitely
something I’d present along my paper, though I might disagree.
I agree that rationales are good things, and I don't want to go too "tu
quoque" here, but... *does* your paper include any rationale? The closest
thing I see is in the "Introduction", where my version strikes out these
two bullets from your version:

- Associativity and commutativity of integers is needlessly obtuse.

- Naïve overflow checks, which are often security-critical, often get
eliminated by compilers. This leads to exploitable code when the intent was
clearly not to and the code, while naïve, was correctly performing security
checks for two’s complement integers. Correct overflow checks are difficult
to write and equally difficult to read, exponentially so in generic code.

These are true, but then the current undefined behavior on signed overflow
has some unmentioned good effects, too:

- Unintentional unsigned wraparound (for example, in the argument to
`malloc`) has been a known source of bugs for a long time. See for example
[Regehr2012] <https://www.cs.utah.edu/~regehr/papers/overflow12.pdf>, whose
final sentence is, "Our results also imply that tools for detecting integer
numerical errors need to *distinguish intentional from unintentional uses
of wraparound operations* — a challenging task — in order to minimize false
alarms. [emphasis added]" The current undefinedness of signed overflow
permits implementations, such as UBSan, to detect all signed wraparound
behavior as unintentional by definition, and diagnose it accordingly.

- The impossibility of signed wraparound allows optimization of tight inner
loops such as
for (int i = a; i != b; ++i)
Here the compiler is allowed to assume that `a <= b`, because if `b < a`
the loop would eventually overflow and invoke undefined behavior. This is
intuitively the same behavior that we have with C++ iterators: the compiler
is allowed to assume that the existence of a loop over the range `a` to `b`
implies that `b` is actually reachable from `a` according to
forward-increment semantics, even though in practice many implementations'
std::list::iterator internally performs the equivalent of "wrap on
overflow." (See the graphical diagram of "sentinel-node containers" in P0773
<http://www.open-std.org/jtc1/sc22/wg21/docs/papers/2017/p0773r0.html#B> if
needed.)
John McFarlane did a lightning talk about integer UB and codegen within the
past year but I don't know if the slides are somewhere. I can ask him.

I mean, it's not like there's any shortage of educational material on UB in
C and C++ and its *good and* bad effects.
What there *is* a shortage of IMHO is material on *ones'-complement* in C
and C++. That's why I kept large swaths of your paper intact in my fork. :)


Post by JF Bastien
In the end I simply want two’s complement, and I see a few ways that this
could play out with everyone liking the outcome. I just ask that opposition
comes with rationales, not “I like my way better”. :-)
Yes; just remember that the rationale for making *no change at all* is
invariably really strong.
And getting the camel's nose into the tent in 2018 does not preclude
getting the rest of the camel at some point! I just want to make as sure
as possible that we can get the nose in. Because even the nose will be
useful. (The "nose" includes arithmetic right-shift, for example. And
well-defined casts from int16_t to int8_t. These are very useful features
to get into C++ even if the rest of the camel remains outside forever!)

–Arthur
JF Bastien
2018-02-23 23:07:18 UTC
Post by JF Bastien
Hi Arthur,
I’ll be in JAX and will definitely champion my own approach, but will make sure any other approaches are discussed thoroughly. I’ll then ask for direction polls from EWG, and will follow the proposed direction in an updated paper.
And if you manage to write the updated paper overnight and have it back on the table the next day at JAX, then my paper will be utterly superfluous. :)
I am, however, worried that the writing of a new paper might slip more than a day, which would end up with you coming back in the pre-Rapperswil mailing with another two's-complement paper after the perception that your first two's-complement paper was "rejected" in Jacksonville, which would set a perceived negative precedent in people's minds.
I’ve frequently presented updated papers after obtaining feedback: evenings are for paper writing.
Post by JF Bastien
I’d therefore appreciate it if you instead had rationales for why you’d take a different approach. I offered my rationale for going the direction I suggest, and the right thing to change people’s minds is to offer compelling arguments to go in another direction. Ideally you’d have data to back up any performance / bug finding / odd architecture claims. Do you think you have time to get this together before JAX? That’s definitely something I’d present along my paper, though I might disagree.
Associativity and commutativity of integers is needlessly obtuse.
Naïve overflow checks, which are often security-critical, often get eliminated by compilers. This leads to exploitable code when the intent was clearly not to and the code, while naïve, was correctly performing security checks for two’s complement integers. Correct overflow checks are difficult to write and equally difficult to read, exponentially so in generic code.
I can indeed improve the rationale. The security aspect is a huge upside for me, having fixed so much “wrong” code of that form. One thing I should add is one of the two pillars of C++ from p0939r0, “A direct map to hardware (initially from C)”. It doesn’t come out of the paper as much as I want, even though it’s what initially motivated me to write it.
Post by JF Bastien
- Unintentional unsigned wraparound (for example, in the argument to `malloc`) has been a known source of bugs for a long time. See for example [Regehr2012] <https://www.cs.utah.edu/~regehr/papers/overflow12.pdf>, whose final sentence is, "Our results also imply that tools for detecting integer numerical errors need to distinguish intentional from unintentional uses of wraparound operations — a challenging task — in order to minimize false alarms. [emphasis added]" The current undefinedness of signed overflow permits implementations, such as UBSan, to detect all signed wraparound behavior as unintentional by definition, and diagnose it accordingly.
Unsigned wraparound isn’t UB, and I claim that signed overflow is UB because of the 3 representations, not to catch bugs; otherwise unsigned overflow would also have been UB.

FWIW UBSan supports unsigned overflow detection:
-fsanitize=unsigned-integer-overflow: Unsigned integer overflows. Note that unlike signed integer overflow, unsigned integer is not undefined behavior. However, while it has well-defined semantics, it is often unintentional, so UBSan offers to catch it.
Post by JF Bastien
- The impossibility of signed wraparound allows optimization of tight inner loops such as
for (int i = a; i != b; ++i)
Here the compiler is allowed to assume that `a <= b`, because if `b < a` the loop would eventually overflow and invoke undefined behavior.
I claim that this is also an emergent feature, not by design, caused by the 3 signed integer representations. I also claim that much of this performance can be regained with a better optimizer (the compiler I work on certainly optimizes loops substantially without assuming UB on overflow). Further, the Internet shows that this optimization isn’t something developers knowingly opt into, and when they hit it they are surprised by the bugs it generates.
Post by JF Bastien
This is intuitively the same behavior that we have with C++ iterators: the compiler is allowed to assume that the existence of a loop over the range `a` to `b` implies that `b` is actually reachable from `a` according to forward-increment semantics, even though in practice many implementations' std::list::iterator internally performs the equivalent of "wrap on overflow." (See the graphical diagram of "sentinel-node containers" in P0773 <http://www.open-std.org/jtc1/sc22/wg21/docs/papers/2017/p0773r0.html#B> if needed.)
John McFarlane did a lightning talk about integer UB and codegen within the past year but I don't know if the slides are somewhere. I can ask him.
The intuition is reversed, though: C came before C++ and iterators. Iterators are closer to pointers than integers, and I’m not removing pointer overflow UB. Put another way: pointers are supposed to be disjoint sets of objects (even in a Harvard-style architecture), and traversing from one to another is as nonsensical as overflowing the pointer, whereas numbers live on a one-dimensional line. Whether you’re a flat-line believer or you believe the line is circular is really what we’re discussing. Where you fall in that belief system decides whether going off the end of the number line takes you to Undefined Land, or back to the other side of the line. For floating-point I definitely believe in the flat-line hypothesis, where going off the end takes us to infinite-land. For signed integers I believe the number line is circular, just as it is for unsigned integers: no traps, no saturation, no UB.
Post by JF Bastien
I mean, it's not like there's any shortage of educational material on UB in C and C++ and its good and bad effects.
Sure, do you believe there are particular references that should be read with my proposal?
Post by JF Bastien
What there is a shortage of IMHO is material on ones'-complement in C and C++. That's why I kept large swaths of your paper intact in my fork. :)
Might the lack of such documentation be caused by a lack of ones’ complement hardware using modern C++?
Post by JF Bastien
In the end I simply want two’s complement, and I see a few ways that this play out with everyone liking the outcome. I just ask that opposition comes with rationales, not “I like my way better”. :-)
Yes; just remember that the rationale for making no change at all is invariably really strong.
And getting the camel's nose into the tent in 2018 does not preclude getting the rest of the camel at some point! I just want to make as sure as possible that we can get the nose in. Because even the nose will be useful. (The "nose" includes arithmetic right-shift, for example. And well-defined casts from int16_t to int8_t. These are very useful features to get into C++ even if the rest of the camel remains outside forever!)
–Arthur
Arthur O'Dwyer
2018-02-24 00:34:02 UTC
Post by Arthur O'Dwyer
Post by JF Bastien
Hi Arthur,
I’ll be in JAX and will definitely champion my own approach, but will
make sure any other approaches are discussed thoroughly. I’ll then ask for
direction polls from EWG, and will follow the proposed direction in an
updated paper.
And if you manage to write the updated paper overnight and have it back on
the table the next day at JAX, then my paper *will* be utterly
superfluous. :)
I am, however, worried that the writing of a new paper might slip more
than a day, which would end up with you coming back in the pre-Rapperswil
mailing with another two's-complement paper after the perception that your
first two's-complement paper was "rejected" in Jacksonville, which would
set a perceived negative precedent in people's minds.
evenings are for paper writing.


Post by Arthur O'Dwyer
These are true, but then the current undefined behavior on signed overflow
has some unmentioned good effects, too:
- Unintentional unsigned wraparound (for example, in the argument to
`malloc`) has been a known source of bugs for a long time. See for example
[Regehr2012] <https://www.cs.utah.edu/~regehr/papers/overflow12.pdf>,
whose final sentence is, "Our results also imply that tools for detecting
integer numerical errors need to *distinguish intentional from
unintentional uses of wraparound operations* — a challenging task — in
order to minimize false alarms. [emphasis added]" The current
undefinedness of signed overflow permits implementations, such as UBSan, to
detect all signed wraparound behavior as unintentional by definition, and
diagnose it accordingly.
unsigned wraparound isn’t UB, and I claim that signed overflow is UB
because of the 3 representation, not to catch bugs, otherwise unsigned
overflow would also have been UB. [...] I claim that this is also an
emergent feature, not by design, caused by the 3 signed integer
representations.
I think you are right in the historical sense. These days the emergent
"catch bugs" rationale has survived, though, even as the exotic hardware
has died out.
Post by Arthur O'Dwyer
-fsanitize=unsigned-integer-overflow: Unsigned integer overflows. Note
that unlike signed integer overflow, unsigned integer is not undefined
behavior. However, while it has well-defined semantics, it is often
unintentional, so UBSan offers to catch it.
This is very cool; I was unaware of this.
Your paper would benefit from mentioning this. But the obvious comeback is:
where are the numbers on how many false positives UBSan generates in this
mode? That number cannot possibly be zero.
Post by Arthur O'Dwyer
- The impossibility of signed wraparound allows optimization of tight inner loops such as
for (int i = a; i != b; ++i)
Here the compiler is allowed to assume that `a <= b`, because if `b < a`
the loop would eventually overflow and invoke undefined behavior.
I claim that this is also an emergent feature, not by design, caused by
the 3 signed integer representations. I also claim that much of this
performance can be regained with a better optimizer (the compiler I work on
certainly optimizes loops substantially without assuming UB on overflow).
Further, the Internet shows that this optimization isn’t something
developers knowingly opt into, and when they hit it they are surprised by
the bugs it generates.
Users never opt into bugs by definition. But users notice performance
regressions (in the *compiler*) almost as quickly as they notice
correctness regressions (in *their own code*).

Post by Arthur O'Dwyer
This is intuitively the same behavior that we have with C++ iterators: the
compiler is allowed to assume that the existence of a loop over the range
`a` to `b` implies that `b` is actually reachable from `a` according to
forward-increment semantics, even though in practice many implementations'
std::list::iterator internally performs the equivalent of "wrap on
overflow." (See the graphical diagram of "sentinel-node containers" in P0773
<http://www.open-std.org/jtc1/sc22/wg21/docs/papers/2017/p0773r0.html#B>
if needed.)
John McFarlane did a lightning talk about integer UB and codegen within
the past year but I don't know if the slides are somewhere. I can ask him.
The intuition is reversed, though: C came before C++ and iterators.
Again, historically accurate but this is not modern C++'s problem. We don't
teach C before C++. (And if we did, we might soon have to explain that
integer overflow is undefined in C but not in C++? Add "C source-level
compatibility" to the list of rationales for preserving C's undefined
overflow behavior in C++.)


Post by Arthur O'Dwyer
I mean, it's not like there's any shortage of educational material on UB in
C and C++ and its *good and* bad effects.
Sure, do you believe there are particular references that should be read with my proposal?
Off the top of my head, I recall John McFarlane's lightning talk on
codegen, Michael Spencer's "My Little Optimizer: Undefined Behavior Is
Magic" talk, and pretty much anything involving John Regehr. I found the
link to Regehr2012 as one of the top Google hits for "unintentional
unsigned overflow".
Post by Arthur O'Dwyer
What there *is* a shortage of IMHO is material on *ones'-complement* in C
and C++. That's why I kept large swaths of your paper intact in my fork. :)
Might the lack of such documentation be caused by a lack of ones’
complement hardware using modern C++?
Yes, that's what I intended to imply here. :)
Nobody teaches about the interaction of ones'-complement or sign-magnitude
with code-generators anymore because these don't happen in practice.
People do teach about the interaction of undefined integer overflow with
code-generators because this *does* happen in practice.

Removing ones'-complement from C++ will be as painless (or painful) as
removing trigraphs was.
Removing integer overflow from C++ will be as painful (or painless) as
removing type-based alias analysis would be.

–Arthur
John McFarlane
2018-02-24 02:37:48 UTC
Post by Arthur O'Dwyer
I mean, it's not like there's any shortage of educational material on UB
in C and C++ and its *good and* bad effects.
Sure, do you believe there are particular references that should be read
with my proposal?
Off the top of my head, I recall John McFarlane's lightning talk on
codegen, Michael Spencer's "My Little Optimizer: Undefined Behavior Is
Magic" talk, and pretty much anything involving John Regehr. I found the
link to Regehr2012 as one of the top Google hits for "unintentional
unsigned overflow".
The lightning talk material can be found here:
https://github.com/johnmcfarlane/presentations/tree/master/2017-09-27_CppCon2017
In particular, Krister Walfridsson's blog post regarding GCC optimization
of signed integers gives some great examples which certainly hadn't
occurred to me:
https://kristerw.blogspot.com/2016/02/how-undefined-signed-overflow-enables.html

John
Richard Smith
2018-02-24 06:51:12 UTC
Post by Arthur O'Dwyer
-fsanitize=unsigned-integer-overflow: Unsigned integer overflows. Note
that unlike signed integer overflow, unsigned integer is not undefined
behavior. However, while it has well-defined semantics, it is often
unintentional, so UBSan offers to catch it.
This is very cool; I was unaware of this.
Your paper would benefit from mentioning this. But the obvious comeback is:
where are the numbers on how many false positives UBSan generates in this
mode? That number cannot possibly be zero.


The false positive rate is empirically huge, and extremely painful because
there is no easy syntactic way to distinguish between true and false
positives.

- The impossibility of signed wraparound allows optimization of tight inner
Post by Arthur O'Dwyer
loops such as
for (int i = a; i != b; ++i)
Here the compiler is allowed to assume that `a <= b`, because if `b < a`
the loop would eventually overflow and invoke undefined behavior.
I claim that this is also an emergent feature, not by design, caused by
the 3 signed integer representations. I also claim that much of this
performance can be regained with a better optimizer (the compiler I work on
certainly optimizes loops substantially without assuming UB on overflow).
Further, the Internet shows that this optimization isn’t something
developers knowingly opt into, and when they hit it they are surprised by
the bugs it generates.
Users never opt into bugs by definition. But users notice performance
regressions (in the *compiler*) almost as quickly as they notice
correctness regressions (in *their own code*).

This is intuitively the same behavior that we have with C++ iterators: the
Post by Arthur O'Dwyer
compiler is allowed to assume that the existence of a loop over the range
`a` to `b` implies that `b` is actually reachable from `a` according to
forward-increment semantics, even though in practice many implementations'
std::list::iterator internally performs the equivalent of "wrap on
overflow." (See the graphical diagram of "sentinel-node containers" in P0773
<http://www.open-std.org/jtc1/sc22/wg21/docs/papers/2017/p0773r0.html#B>
if needed.)
John McFarlane did a lightning talk about integer UB and codegen within
the past year but I don't know if the slides are somewhere. I can ask him.
The intuition is reversed, though: C came before C++ and iterators.
Again, historically accurate but this is not modern C++'s problem. We don't
teach C before C++. (And if we did, we might soon have to explain that
integer overflow is undefined in C but not in C++? Add "C source-level
compatibility" to the list of rationales for preserving C's undefined
overflow behavior in C++.)


I mean, it's not like there's any shortage of educational material on UB in
Post by Arthur O'Dwyer
C and C++ and its *good and* bad effects.
Sure, do you believe there are particular references that should be read with my proposal?
Off the top of my head, I recall John McFarlane's lightning talk on
codegen, Michael Spencer's "My Little Optimizer: Undefined Behavior Is
Magic" talk, and pretty much anything involving John Regehr. I found the
link to Regehr2012 as one of the top Google hits for "unintentional
unsigned overflow".
Post by Arthur O'Dwyer
What there *is* a shortage of IMHO is material on *ones'-complement* in C
and C++. That's why I kept large swaths of your paper intact in my fork. :)
Might the lack of such documentation be caused by a lack of ones’
complement hardware using modern C++?
Yes, that's what I intended to imply here. :)
Nobody teaches about the interaction of ones'-complement or sign-magnitude
with code-generators anymore because these don't happen in practice.
People do teach about the interaction of undefined integer overflow with
code-generators because this *does* happen in practice.

Removing ones'-complement from C++ will be as painless (or painful) as
removing trigraphs was.
Removing integer overflow from C++ will be as painful (or painless) as
removing type-based alias analysis would be.

–Arthur
--
You received this message because you are subscribed to the Google Groups
"ISO C++ Standard - Future Proposals" group.
To unsubscribe from this group and stop receiving emails from it, send an
email to std-proposals+***@isocpp.org.
To post to this group, send email to std-***@isocpp.org.
To view this discussion on the web visit https://groups.google.com/a/
isocpp.org/d/msgid/std-proposals/CADvuK0L%2BBDk3TMYX6znEe0RMbEPZfYGzEWDL
jOXLB60EyR5R6g%40mail.gmail.com
<https://groups.google.com/a/isocpp.org/d/msgid/std-proposals/CADvuK0L%2BBDk3TMYX6znEe0RMbEPZfYGzEWDLjOXLB60EyR5R6g%40mail.gmail.com?utm_medium=email&utm_source=footer>
.
Magnus Fromreide
2018-02-24 10:05:38 UTC
Permalink
Post by Arthur O'Dwyer
Yes, that's what I intended to imply here. :)
Nobody teaches about the interaction of ones'-complement or sign-magnitude
with code-generators anymore because these don't happen in practice.
People do teach about the interaction of undefined integer overflow with
code-generators because this *does* happen in practice.
Removing ones'-complement from C++ will be as painless (or painful) as
removing trigraphs was.
Removing integer overflow from C++ will be as painful (or painless) as
removing type-based alias analysis would be.
One thing I have been mulling over is whether it would make sense to create
context-dependent keywords that allow the user to choose the behaviour
they want.

nonoverflowing unsigned int foo; // UB if overflow
wrapping int bar; // Wraps

I do not see these modifiers as being part of the types.

Under this scheme "unsigned int" would be equivalent to "wrapping unsigned
int" and "int" would be equivalent to "nonoverflowing int".

Obviously this bikeshed would need to be painted but that is for later.

Changing the equivalence of int might be reasonable, but that is an unrelated
change, just like killing ones'-complement.

/MF
Jonathan Müller
2018-02-24 16:08:36 UTC
Permalink
Post by Magnus Fromreide
Post by Arthur O'Dwyer
Yes, that's what I intended to imply here. :)
Nobody teaches about the interaction of ones'-complement or sign-magnitude
with code-generators anymore because these don't happen in practice.
People do teach about the interaction of undefined integer overflow with
code-generators because this *does* happen in practice.
Removing ones'-complement from C++ will be as painless (or painful) as
removing trigraphs was.
Removing integer overflow from C++ will be as painful (or painless) as
removing type-based alias analysis would be.
One thing I have been mulling over is whether it would make sense to create
context-dependent keywords that allow the user to choose the behaviour
they want.
nonoverflowing unsigned int foo; // UB if overflow
wrapping int bar; // Wraps
I do not see these modifiers as being part of the types.
Under this scheme "unsigned int" would be equivalent to "wrapping unsigned
int" and "int" would be equivalent to "nonoverflowing int".
Obviously this bikeshed would need to be painted but that is for later.
Changing the equivalence of int might be reasonable but that is an unrelated
change just like killing ones'-complement.
/MF
Why make it keywords though?

It is easy to have a `std::nonoverflowing_unsigned` and
`std::wrapping_int` if we chose to go that route.
Nicol Bolas
2018-02-24 16:47:06 UTC
Permalink
Post by Magnus Fromreide
Post by Arthur O'Dwyer
Yes, that's what I intended to imply here. :)
Nobody teaches about the interaction of ones'-complement or sign-magnitude
with code-generators anymore because these don't happen in practice.
People do teach about the interaction of undefined integer overflow with
code-generators because this *does* happen in practice.
Removing ones'-complement from C++ will be as painless (or painful) as
removing trigraphs was.
Removing integer overflow from C++ will be as painful (or painless) as
removing type-based alias analysis would be.
One thing I have been mulling over is whether it would make sense to create
context-dependent keywords that allow the user to choose the behaviour
they want.
nonoverflowing unsigned int foo; // UB if overflow
wrapping int bar; // Wraps
I do not see these modifiers as being part of the types.
Under this scheme "unsigned int" would be equivalent to "wrapping unsigned
int" and "int" would be equivalent to "nonoverflowing int".
Obviously this bikeshed would need to be painted but that is for later.
Changing the equivalence of int might be reasonable but that is an unrelated
change just like killing ones'-complement.
/MF
Why make it keywords though?
Because otherwise they'd have to be *types*. And the whole point of his
suggestion was that they're not different types. There is no such thing as
a `wrapped int` type; there's just an `int`, which the compiler will treat
as "wrapped".

Now granted, I don't think this un-typed approach will work. If you do this:

wrapped int a = ...;
wrapped int b = ...;
int c = ...;

auto x = a + b;
auto y = a + c;

Is `x` or `y` declared `wrapped`? If `x` is wrapped and `y` is not, why?
How many `wrapped` integers does it take to make an expression "wrapped"?
And if neither of them are wrapped, how do you do auto-deduction with
wrapping?

With a type-based approach, these questions answer themselves.
`wrapped_int` would be a type. Adding a `wrapped_int` to a `wrapped_int`
would produce another `wrapped_int`, which `auto` would naturally deduce.
There would be rules about adding `wrapped_` types to their unwrapped
equivalents, which would make the deduction of `y` obvious. You're not
adding a phantasmal property to expressions; you're just using the type
system.

I think the main advantage of the untyped approach is that it neatly ducks
conversion issues. That is, is the conversion from `wrapped_int` to `int`
like a user-defined conversion, or is it like converting a `short` to an
`int`? How does it behave with overload resolution and scoring? And so
forth.

The thing is, we don't really *want* to treat a wrapped integer as a
different type from `int`. What we *really* want is to treat an *expression*
as wrapped or unwrapped. It's just that using a type-based approach to this
makes it much more convenient (from a language perspective) to do this,
since tracking the type of expressions is something we already do.
Robert Ramey
2018-02-24 17:28:57 UTC
Permalink
Post by Nicol Bolas
With a type-based approach, these questions answer themselves.
`wrapped_int` would be a type. Adding a `wrapped_int` to a `wrapped_int`
would produce another `wrapped_int`, which `auto` would naturally
deduce. There would be rules about adding `wrapped_` types to their
unwrapped equivalents, which would make the deduction of `y` obvious.
You're not adding a phantasmal property to expressions; you're just
using the type system.
To get an idea of where this eventually leads, consider the safe
numerics library. I've been working on this for some time; it's been
accepted by Boost but is still pending integration into Boost. More
information can be found at the Boost Library Incubator.

Robert Ramey
Jonathan Müller
2018-02-24 18:32:00 UTC
Permalink
Post by Jonathan Müller
Why make it keywords though?
Because otherwise they'd have to be /types/. And the whole point of his
suggestion was that they're not different types. There is no such thing
as a `wrapped int` type; there's just an `int`, which the compiler will
treat as "wrapped".
But what's the downside of making them types?

I've heard of languages where you have `+` for addition with UB overflow
and something like `+%` for addition with modulo. But I don't think it
makes that much sense: either you want addition with UB for all addition
operations on an object, or modulo for all. The only exception I can
think of would be in the implementation of a `would_overflow(a, b)`,
but there you can cast.
Post by Jonathan Müller
I think the main advantage of the untyped approach is that it neatly
ducks conversion issues. That is, is the conversion from `wrapped_int`
to `int` like a user-defined conversion, or is it like converting a
`short` to an `int`? How does it behave with overload resolution and
scoring? And so forth.
I don't think you'd need implicit conversion for those at all.
But I'm not a fan of implicit conversion anyway.
Post by Jonathan Müller
The thing is, we don't really /want/ to treat a wrapped integer as a
different type from `int`. What we /really/ want is to treat an
/expression/ as wrapped or unwrapped.
I disagree, but don't have a strong opinion there.
Nicol Bolas
2018-02-24 19:44:47 UTC
Permalink
Post by Jonathan Müller
On Saturday, February 24, 2018 at 11:08:43 AM UTC-5, Jonathan Müller
Why make it keywords though?
Because otherwise they'd have to be /types/. And the whole point of his
suggestion was that they're not different types. There is no such thing
as a `wrapped int` type; there's just an `int`, which the compiler will
treat as "wrapped".
But what's the downside of making them types?
Wrapping behavior is not really a property of an object; it's a property of
how you *use* the object. By putting it in the object type, you now create
oddities. For example, you can overload on `wrapped_int` vs. `int`, which
can create *very* strange behavior. Is that a useful thing?

Post by Jonathan Müller
I've heard of languages where you have `+` for addition with UB overflow
and something like `+%` for addition with modulo. But I don't think it
makes that much sense: either you want addition with UB for all addition
operations on an object, or modulo for all. The only exception - I can
think of - would be in the implementation of an `would_overflow(a, b)`,
but there you can cast.
I think the main advantage of the untyped approach is that it neatly
ducks conversion issues. That is, is the conversion from `wrapped_int`
to `int` like a user-defined conversion, or is it like converting a
`short` to an `int`? How does it behave with overload resolution and
scoring? And so forth.
I don't think you'd need implicit conversion for those at all.
Consider this:

wrapped_int i = ...;
auto j = i + 5;

If that doesn't get me proper wrapping behavior, or worse if it is a
compile error due to the lack of conversion, then something is wrong with
your `wrapped_int`.
Post by Jonathan Müller
But I'm not a fan of implicit conversion anyway.
The thing is, we don't really /want/ to treat a wrapped integer as a
different type from `int`. What we /really/ want is to treat an
/expression/ as wrapped or unwrapped.
I disagree, but don't have a strong opinion there.
Tony V E
2018-02-24 22:10:15 UTC
Permalink
Post by Nicol Bolas
Post by Jonathan Müller
On Saturday, February 24, 2018 at 11:08:43 AM UTC-5, Jonathan Müller
Why make it keywords though?
Because otherwise they'd have to be /types/. And the whole point of his
suggestion was that they're not different types. There is no such thing
as a `wrapped int` type; there's just an `int`, which the compiler will
treat as "wrapped".
But what's the downside of making them types?
Wrapping behavior is not really a property of an object; it's a property of
how you *use* the object.
A type is an interpretation of bytes, plus allowable operations on those
bytes.
A wrapping int and a non-wrapping int have the same interpretations (the
number 17 is stored the same for both, I imagine), but they do different
operations (or do the same operations differently).
So they are different types.

Or make a single type with extra operations (+<, +%, etc)
Post by Nicol Bolas
By putting it in the object type, you now create oddities. For example,
you can overload on `wrapped_int` vs. `int`, which can create *very*
strange behavior. Is that a useful thing?
I've heard of languages where you have `+` for addition with UB overflow
Post by Jonathan Müller
and something like `+%` for addition with modulo. But I don't think it
makes that much sense: either you want addition with UB for all addition
operations on an object, or modulo for all. The only exception - I can
think of - would be in the implementation of an `would_overflow(a, b)`,
but there you can cast.
I think the main advantage of the untyped approach is that it neatly
ducks conversion issues. That is, is the conversion from `wrapped_int`
to `int` like a user-defined conversion, or is it like converting a
`short` to an `int`? How does it behave with overload resolution and
scoring? And so forth.
I don't think you'd need implicit conversion for those at all.
wrapped_int i = ...;
auto j = i + 5;
If that doesn't get me proper wrapping behavior, or worse if it is a
compile error due to the lack of conversion, then something is wrong with
your `wrapped_int`.
Post by Jonathan Müller
But I'm not a fan of implicit conversion anyway.
The thing is, we don't really /want/ to treat a wrapped integer as a
different type from `int`. What we /really/ want is to treat an
/expression/ as wrapped or unwrapped.
I disagree, but don't have a strong opinion there.
--
Be seeing you,
Tony
Nicol Bolas
2018-02-24 23:04:16 UTC
Permalink
Post by Tony V E
Post by Nicol Bolas
Post by Jonathan Müller
On Saturday, February 24, 2018 at 11:08:43 AM UTC-5, Jonathan Müller
Why make it keywords though?
Because otherwise they'd have to be /types/. And the whole point of his
suggestion was that they're not different types. There is no such thing
as a `wrapped int` type; there's just an `int`, which the compiler will
treat as "wrapped".
But what's the downside of making them types?
Wrapping behavior is not really a property of an object; it's a property of
how you *use* the object.
A type is an interpretation of bytes, plus allowable operations on those
bytes.
A wrapping int and a non-wrapping int have the same interpretations (the
number 17 is stored the same for both, I imagine), but they do different
operations (or do the same operations differently).
But they don't even do things differently per se. If `int` and
`wrapped_int` are both 2's complement, the only difference between them is
how the *compiler* interprets an overflow. In one case, overflow is
considered UB; in the other, it has well-defined behavior.

It's not a property of the operation itself; it's a property of the
resulting value of that operation. That's why I'm thinking that this is
best handled by specifying that you want the *expression* to wrap signed
overflow:

auto x = wrapped{a + b};

This would mean that any math operation on integers within that boundary
wraps overflow, according to the rules of 2's complement.

And you ought to be able to do this for statements. Lots of statements:

wrapped
{
auto math = a + b;
auto is = math - c;
auto being = a + math;
auto done = c + being + is;
}

I don't know if this is worthy of a keyword, but the general idea ought to
be that you designate the explicit operations you want to do this on, not
the types or objects. And if we want to create a `std::wrapped_int` that
wraps a regular `int` and forwards all expressions with `wrapped{}`, that's
fine.
Arthur O'Dwyer
2018-02-24 23:06:05 UTC
Permalink
Post by Tony V E
Post by Nicol Bolas
Post by Jonathan Müller
But what's the downside of making them types?
Wrapping behavior is not really a property of object; it's a property of
how you *use* the object.
A type is an interpretation of bytes, plus allowable operations on those
bytes.
A wrapping int and a non-wrapping int have the same interpretations (the
number 17 is stored the same for both, I imagine), but they do different
operations (or do the same operations differently).
So they are different types.
Or make a single type with extra operations (+<, +%, etc)
Tony is absolutely correct.

Note that the extra operations (if any) do not *have* to be spelled +<, +%,
etc. In typical C and C++ codebases, they are spelled as functions:
wrapping_add, safe_add, whatever.

An excruciatingly detailed set of "safe math" functions is available in the
Csmith include files:
https://github.com/csmith-project/csmith/tree/master/runtime
although unfortunately in the repo they're expressed as a bunch of M4
macros that have to be macro-processed before use (i.e., I can't give a URL
directly to the functions because they don't exist in the Csmith repo as
such).

–Arthur
JF Bastien
2018-02-26 17:16:32 UTC
Permalink
Post by JF Bastien
Post by JF Bastien
Hi Arthur,
I’ll be in JAX and will definitely champion my own approach, but will make sure any other approaches are discussed thoroughly. I’ll then ask for direction polls from EWG, and will follow the proposed direction in an updated paper.
And if you manage to write the updated paper overnight and have it back on the table the next day at JAX, then my paper will be utterly superfluous. :)
I am, however, worried that the writing of a new paper might slip more than a day, which would end up with you coming back in the pre-Rapperswil mailing with another two's-complement paper after the perception that your first two's-complement paper was "rejected" in Jacksonville, which would set a perceived negative precedent in people's minds.
I’ve frequently presented updated papers after obtaining feedback: evenings are for paper writing.
Post by JF Bastien
- Unintentional unsigned wraparound (for example, in the argument to `malloc`) has been a known source of bugs for a long time. See for example [Regehr2012] <https://www.cs.utah.edu/~regehr/papers/overflow12.pdf>, whose final sentence is, "Our results also imply that tools for detecting integer numerical errors need to distinguish intentional from unintentional uses of wraparound operations — a challenging task — in order to minimize false alarms. [emphasis added]" The current undefinedness of signed overflow permits implementations, such as UBSan, to detect all signed wraparound behavior as unintentional by definition, and diagnose it accordingly.
unsigned wraparound isn’t UB, and I claim that signed overflow is UB because of the 3 representations, not to catch bugs, otherwise unsigned overflow would also have been UB. [...] I claim that this is also an emergent feature, not by design, caused by the 3 signed integer representations.
I think you are right in the historical sense. These days the emergent "catch bugs" rationale has survived, though, even as the exotic hardwares have died out.
-fsanitize=unsigned-integer-overflow: Unsigned integer overflows. Note that unlike signed integer overflow, unsigned integer is not undefined behavior. However, while it has well-defined semantics, it is often unintentional, so UBSan offers to catch it.
This is very cool; I was unaware of this.
Your paper would benefit from mentioning this. But the obvious comeback is: where's the numbers on how many false positives UBSan generates in this mode? That number cannot possibly be zero.
The false positive rate is empirically huge, and extremely painful because there is no easy syntactic way to distinguish between true and false positives.
I’ve asked Chandler but I’ll ask you as well: data would be great in getting committee consensus. To me “false positive” sounds an awful lot like “not a bug” in this context :-)
Post by JF Bastien
Post by JF Bastien
- The impossibility of signed wraparound allows optimization of tight inner loops such as
for (int i = a; i != b; ++i)
Here the compiler is allowed to assume that `a <= b`, because if `b < a` the loop would eventually overflow and invoke undefined behavior.
I claim that this is also an emergent feature, not by design, caused by the 3 signed integer representations. I also claim that much of this performance can be regained with a better optimizer (the compiler I work on certainly optimizes loops substantially without assuming UB on overflow). Further, the Internet shows that this optimization isn’t something developers knowingly opt into, and when they hit it they are surprised by the bugs it generates.
Users never opt into bugs by definition. But users notice performance regressions (in the compiler) almost as quickly as they notice correctness regressions (in their own code).
Post by JF Bastien
This is intuitively the same behavior that we have with C++ iterators: the compiler is allowed to assume that the existence of a loop over the range `a` to `b` implies that `b` is actually reachable from `a` according to forward-increment semantics, even though in practice many implementations' std::list::iterator internally performs the equivalent of "wrap on overflow." (See the graphical diagram of "sentinel-node containers" in P0773 <http://www.open-std.org/jtc1/sc22/wg21/docs/papers/2017/p0773r0.html#B> if needed.)
John McFarlane did a lightning talk about integer UB and codegen within the past year but I don't know if the slides are somewhere. I can ask him.
The intuition is reversed, though: C came before C++ and iterators.
Again, historically accurate but this is not modern C++'s problem. We don't teach C before C++. (And if we did, we might soon have to explain that integer overflow is undefined in C but not in C++? Add "C source-level compatibility" to the list of rationales for preserving C's undefined overflow behavior in C++.)
Post by JF Bastien
I mean, it's not like there's any shortage of educational material on UB in C and C++ and its good and bad effects.
Sure, do you believe there are particular references that should be read with my proposal?
Off the top of my head, I recall John McFarlane's lightning talk on codegen, Michael Spencer's "My Little Optimizer: Undefined Behavior Is Magic" talk, and pretty much anything involving John Regehr. I found the link to Regehr2012 as one of the top Google hits for "unintentional unsigned overflow".
Post by JF Bastien
What there is a shortage of IMHO is material on ones'-complement in C and C++. That's why I kept large swaths of your paper intact in my fork. :)
Might the lack of such documentation be caused by a lack of ones’ complement hardware using modern C++?
Yes, that's what I intended to imply here. :)
Nobody teaches about the interaction of ones'-complement or sign-magnitude with code-generators anymore because these don't happen in practice.
People do teach about the interaction of undefined integer overflow with code-generators because this does happen in practice.
Removing ones'-complement from C++ will be as painless (or painful) as removing trigraphs was.
Removing integer overflow from C++ will be as painful (or painless) as removing type-based alias analysis would be.
–Arthur
Nevin Liber
2018-02-26 17:50:05 UTC
Permalink
Post by JF Bastien
I’ve asked Chandler but I’ll ask you as well: data would be great in
getting committee consensus. To me “false positive” sounds an awful lot
like “not a bug” in this context :-)
I don't understand this line of reasoning. If you consider *unsigned*
wrapping detection to be a good thing in the sanitizer (I assume that is
what you mean by "not a bug"), why on earth do you want *signed* overflow
to wrap? Do you expect sanitizers to detect this? If so, why make it
legitimate behavior?

Without wrapping, I don't see enough motivation to change the status quo,
especially since it needlessly introduces an incompatibility with C. And I
certainly don't see how this is in the "top 20" things for C++20.


IMO, if the sanitizers are giving false positives, they should remove the
misleading bit about no false positives as a goal in the documentation
<https://clang.llvm.org/docs/UndefinedBehaviorSanitizer.html#issue-suppression>.
--
Nevin ":-)" Liber <mailto:***@eviloverlord.com> +1-847-691-1404
'Matt Calabrese' via ISO C++ Standard - Future Proposals
2018-02-26 18:46:04 UTC
Permalink
Post by Nevin Liber
Post by JF Bastien
I’ve asked Chandler but I’ll ask you as well: data would be great in
getting committee consensus. To me “false positive” sounds an awful lot
like “not a bug” in this context :-)
I don't understand this line of reasoning. If you consider *unsigned*
wrapping detection to be a good thing in the sanitizer (I assume that is
what you mean by "not a bug"), why on earth do you want *signed* overflow
to wrap? Do you expect sanitizers to detect this? If so, why make it
legitimate behavior?
Agreed. Simultaneously saying "signed arithmetic wraps" and encouraging
sanitizers to check for overflow is contradictory. Defining the behavior is
a statement that users can rely on that behavior (otherwise why would you
standardize it at all?). Having a sanitizer then check to make sure that
people aren't overflowing means that people shouldn't rely on it, otherwise
what you get is a sanitizer that has false positives, giving incentive for
people to *not* use it. The end state that you are left with is fewer
people using the sanitizer (leaving bugs latent) while also preventing
optimizations. What exactly is the point, here?

I do not see how this is at all a good direction for the standard. If you
want signed arithmetic to wrap on your implementation, then that is
something you can request from your compiler. I'm all for introducing new
functions for a wrapping add or something along those lines, but trying to
make + wrap, especially with encouragement for sanitizers to still check
for overflow, seems like all negatives to me.
Nevin Liber
2018-02-26 19:34:07 UTC
Permalink
On Mon, Feb 26, 2018 at 12:46 PM, 'Matt Calabrese' via ISO C++ Standard -
Having a sanitizer then check to make sure that people aren't overflowing
means that people shouldn't rely on it, otherwise what you get is a
sanitizer that has false positives, giving incentive for people to *not*
use it. The end state that you are left with is fewer people using the
sanitizer (leaving bugs latent) while also preventing optimizations. What
exactly is the point, here?
Yup. My fear is that, like compiler warnings, it migrates to becoming a
style enforcer rather than a problem finder. We jump through hoops to get
our code to compile without warnings, which ranges from painful to nearly
impossible if you use multiple compilers (just ask the Boost folks). But
that is an issue for SG15, not std-proposals...
--
Nevin ":-)" Liber <mailto:***@eviloverlord.com> +1-847-691-1404
Andrey Semashev
2018-02-26 19:03:53 UTC
Permalink
Post by Nevin Liber
Without wrapping, I don't see enough motivation to change the status
quo, especially since it needlessly introduces an incompatibility with
C.  And I certainly don't see how this is in the "top 20" things for C++20.
Having two's complement signed in the standard is worth it even without
well defined overflow semantics. Just the guaranteed ability to cast
from unsigned to signed without UB alone makes it worth it. Currently,
there simply isn't a way to cast from unsigned to signed portably and
fail-safe. The whole "let's use signed integers everywhere" incentive
exists because of this (and therefore is misguided, IMHO).

Personally, I want the language to provide tools with well defined
overflow semantics as well. These may not be the current signed integer
types, but certainly they should be types with overloaded arithmetic
operators (free functions and such are just too verbose for no reason).
Nevin Liber
2018-02-26 19:27:25 UTC
Permalink
Post by Andrey Semashev
Having two's complement signed in the standard is worth it even without
well defined overflow semantics. Just the guaranteed ability to cast from
unsigned to signed without UB alone makes it worth it. Currently, there
simply isn't a way to cast from unsigned to signed portably and fail-safe.
But we can add functions to perform that.


Either the bar for breaking C compatibility should be high, or we should
deliberately decide that C compatibility is no longer a goal of C++. This
paper doesn't fit into either of those categories.
Post by Andrey Semashev
The whole "let's use signed integers everywhere" incentive exists because
of this (and therefore is misguided, IMHO).
There are a bunch of reasons for it, a few of which have been mentioned in
this thread (sanitizers, optimization, etc.).
Post by Andrey Semashev
Personally, I want the language to provide tools with well defined
overflow semantics as well. These may not be the current signed integer
types, but certainly they should be types with overloaded arithmetic
operators (free functions and such are just too verbose for no reason).
I'd probably support such a proposal, as it meets my criterion that it
indicates when one deliberately wants the overflow semantics vs.
accidentally depending on the overflow semantics (which is what we would
get if we adopted the paper w/o any changes). (Of course, some people will
use it everywhere, but that is an education problem.)
--
Nevin ":-)" Liber <mailto:***@eviloverlord.com> +1-847-691-1404
Andrey Semashev
2018-02-26 19:40:44 UTC
Permalink
On Mon, Feb 26, 2018 at 1:03 PM, Andrey Semashev
Having two's complement signed in the standard is worth it even
without well defined overflow semantics. Just the guaranteed ability
to cast from unsigned to signed without UB alone makes it worth it.
Currently, there simply isn't a way to cast from unsigned to signed
portably and fail-safe.
But we can add functions to perform that.
I don't see how such a function could be fail-safe and portable (that
is, produce the same result regardless of the signed integer
representation) because different representations don't have equivalent
ranges of values.

Further, why should it be a dedicated function when we have static_cast?
What would be the use case to use the old and unreliable static_cast and
not just always use this conversion function? Also, a special function
is difficult to use in generic code and cannot be used when the cast is
implicit (which, presumably, would still be unreliable).

No, IMHO this problem needs to be solved in its source, which is
incompatibility between signed and unsigned integers.
Nevin Liber
2018-02-26 19:51:37 UTC
Permalink
I don't see how such a function could be fail-safe and portable (that is,
produce the same result regardless of the signed integer representation)
because different representations don't have equivalent ranges of values.
If it doesn't exist, you can't do it portably. Seems that would be easier
to pass than being incompatible with C just to support this.
Further, why should it be a dedicated function when we have static_cast?
I want the function anyway. Casting is a many-to-one operation (one
spelling covers many different conversions), which makes it error-prone and
clunky to use:

static_cast<make_signed_t<type>>(i)

is less preferable to something like

convert_to_signed(i)

No, IMHO this problem needs to be solved in its source, which is
incompatibility between signed and unsigned integers.
They have different ranges of valid values, so you cannot "make them
compatible" without making some set of users unhappy.
--
Nevin ":-)" Liber <mailto:***@eviloverlord.com> +1-847-691-1404
Andrey Semashev
2018-02-26 20:22:02 UTC
Permalink
On Mon, Feb 26, 2018 at 1:40 PM, Andrey Semashev
I don't see how such a function could be fail-safe and portable
(that is, produce the same result regardless of the signed integer
representation) because different representations don't have
equivalent ranges of values.
If it doesn't exist, you can't do it portably.  Seems that would be
easier to pass than being incompatible with C just to support this.
Given that two's complement is ubiquitous, I don't think this
incompatibility will have much significance. And who knows, maybe C will
follow.
Further, why should it be a dedicated function when we have static_cast?
I want the function anyway.  casting is a many-to-one operation, which
static_cast<make_signed_t<type>>(i)
is less preferable to something like
convert_to_signed(i)
Again, this doesn't work with implicit casts.
No, IMHO this problem needs to be solved in its source, which is
incompatibility between signed and unsigned integers.
They have different ranges of valid values, so you cannot "make them
compatible" without making some set of users unhappy.
They can be made compatible in the sense of casts preserving bitwise
value. For that to be portable, signed representation has to be fixed.
Arthur O'Dwyer
2018-02-26 20:20:09 UTC
Permalink
@Andrey: Agreed, I'd like to define the conversion behavior of
int(0xFFFFFFFF).
@Matt: Agreed, building "wrap on overflow" into operator+ seems like a bad
idea. I mean, it's a great idea if you know that wrapping is the behavior
you want and you're relying on wrapping — but it's a terrible idea if
you're not even thinking about the possibility of overflow and suddenly
your code starts doing the wrong thing (wrapping on overflow) and you can't
figure out why or where because there's no way to distinguish "intentional"
wrapping from "accidental" wrapping.

The distinction here is that integral conversions, truncations, and bitwise
operations & | ^ ~ << >> depend fundamentally on the representation of
signed integers, which in practice is always two's-complement. The numeric
arithmetic operations + - * / % do not fundamentally depend on
*representation*; they depend on some other property of the mapping from
the mathematical number line onto a C++ data type.
I have tried to reflect this theory a little better in the latest draft of
my "conservative" paper:
https://quuxplusone.github.io/draft/twosc-conservative.html

Also in my latest draft, I've added two tables. Table 1 shows C++
expressions with their current behavior, the behavior that would be
mandated under my revised proposal, and the name of the WD section with the
relevant wording. Table 2 shows some more C++ expressions which also
currently have undefined behavior, and which would continue to have
undefined behavior after my proposal. My understanding is that JF's
proposal — which, again, is the one that will be shown to EWG — would
mandate defined behavior for the first two examples in Table 2.

–Arthur
Post by Andrey Semashev
Post by Nevin Liber
Without wrapping, I don't see enough motivation to change the status quo,
especially since it needlessly introduces an incompatibility with C. And I
certainly don't see how this is in the "top 20" things for C++20.
Having two's complement signed in the standard is worth it even without
well defined overflow semantics. Just the guaranteed ability to cast from
unsigned to signed without UB alone makes it worth it. Currently, there
simply isn't a way to cast from unsigned to signed portably and fail-safe.
The whole "let's use signed integers everywhere" incentive exists because
of this (and therefore is misguided, IMHO).
Personally, I want the language to provide tools with well defined
overflow semantics as well. These may not be the current signed integer
types, but certainly they should be types with overloaded arithmetic
operators (free functions and such are just too verbose for no reason).
Thiago Macieira
2018-02-26 23:03:54 UTC
Permalink
Post by Andrey Semashev
Just the guaranteed ability to cast
from unsigned to signed without UB alone makes it worth it.
That's never been UB. But it is IB:

http://eel.is/c++draft/conv.integral#3
"If the destination type is signed, the value is unchanged if it can be
represented in the destination type; otherwise, the value is implementation-
defined."
--
Thiago Macieira - thiago (AT) macieira.info - thiago (AT) kde.org
Software Architect - Intel Open Source Technology Center
Andrey Semashev
2018-02-27 01:06:52 UTC
Permalink
Post by Thiago Macieira
Post by Andrey Semashev
Just the guaranteed ability to cast
from unsigned to signed without UB alone makes it worth it.
http://eel.is/c++draft/conv.integral#3
"If the destination type is signed, the value is unchanged if it can be
represented in the destination type; otherwise, the value is implementation-
defined."
Right. It doesn't make it portable, though, so there's no practical
difference.
Edward Catmur
2018-02-26 23:05:20 UTC
Permalink
Post by Nevin Liber
Without wrapping, I don't see enough motivation to change the status
quo, especially since it needlessly introduces an incompatibility with
C. And I certainly don't see how this is in the "top 20" things for
C++20.
Having two's complement signed in the standard is worth it even without
well defined overflow semantics. Just the guaranteed ability to cast
from unsigned to signed without UB alone makes it worth it. Currently,
there simply isn't a way to cast from unsigned to signed portably and
fail-safe. The whole "let's use signed integers everywhere" incentive
exists because of this (and therefore is misguided, IMHO).
Did you mean UB? Unsigned to signed conversion is ID (implementation-defined), as far as I recall.

Wouldn't it be enough to require the signed-unsigned-signed conversion to
round trip (losing the sign of negative zero)?

Personally, I want the language to provide tools with well defined
overflow semantics as well. These may not be the current signed integer
types, but certainly they should be types with overloaded arithmetic
operators (free functions and such are just too verbose for no reason).
Arthur O'Dwyer
2018-02-26 23:22:31 UTC
Permalink
Post by Edward Catmur
Post by Nevin Liber
Post by Nevin Liber
Without wrapping, I don't see enough motivation to change the status
quo, especially since it needlessly introduces an incompatibility with
C. And I certainly don't see how this is in the "top 20" things for
C++20.
Having two's complement signed in the standard is worth it even without
well defined overflow semantics. Just the guaranteed ability to cast
from unsigned to signed without UB alone makes it worth it. Currently,
there simply isn't a way to cast from unsigned to signed portably and
fail-safe. The whole "let's use signed integers everywhere" incentive
exists because of this (and therefore is misguided, IMHO).
Did you mean UB? Unsigned to signed conversion is ID (implementation-defined), as far as I recall.
Wouldn't it be enough to require the signed-unsigned-signed conversion to
round trip (losing the sign of negative zero)?
Unsigned-to-signed conversion is UB only when there are enum types involved.
I have added two tables to my paper that explain the current situation and
the proposed situation(s), with examples:
https://quuxplusone.github.io/draft/twosc-conservative.html#intro

HTH,
Arthur
Thiago Macieira
2018-02-27 00:10:21 UTC
Permalink
Post by Arthur O'Dwyer
Unsigned-to-signed conversion is UB only when there are enum types involved.
I have added two tables to my paper that explain the current situation and
https://quuxplusone.github.io/draft/twosc-conservative.html#intro
Thanks for the update.

By the way, where you say "Notice that atomic integral types are already two’s
complement and have no undefined results; therefore even freestanding
implementations must already support two’s complement somehow.", that's also
true for some operations on non-atomics. The conversion from signed to
unsigned is basically a two's complement no-op, since it requires

unsigned(-1) =
unsigned(infinite_precision(UINT_MAX + 1) - 1) =
UINT_MAX
(which is ~0, or 0xFFFFFFFF on 32-bit ints)

An implementation that used one's complement negative values would need to
mutate the bit pattern by adding 1.
--
Thiago Macieira - thiago (AT) macieira.info - thiago (AT) kde.org
Software Architect - Intel Open Source Technology Center
Andrey Semashev
2018-02-27 01:04:49 UTC
Permalink
Post by Thiago Macieira
Post by Arthur O'Dwyer
Unsigned-to-signed conversion is UB only when there are enum types involved.
I have added two tables to my paper that explain the current situation and
https://quuxplusone.github.io/draft/twosc-conservative.html#intro
Thanks for the update.
By the way, where you say "Notice that atomic integral types are already two’s
complement and have no undefined results; therefore even freestanding
implementations must already support two’s complement somehow.", that's also
true for some operations on non-atomics. The conversion from signed to
unsigned is basically a two's complement no-op, since it requires
unsigned(-1) =
unsigned(infinite_precision(UINT_MAX + 1) - 1) =
UINT_MAX
(which is ~0, or 0xFFFFFFFF on 32-bit ints)
An implementation that used one's complement negative values would need to
mutate the bit pattern by adding 1.
The conversion basically adds the mathematical value of the signed
integer to the unsigned two's complement 0. It doesn't require the
signed integer to have a two's complement representation if the machine
can perform such heterogeneous addition (or an equivalent conversion).
Andrey Semashev
2018-02-27 00:35:08 UTC
Permalink
Post by Arthur O'Dwyer
Unsigned-to-signed conversion is UB only when there are enum types involved.
I have added two tables to my paper that explain the current situation
https://quuxplusone.github.io/draft/twosc-conservative.html#intro
You may want to add a peculiar use case to the proposal: IIUC, it should
fix the overflow caused by the expression:

long long x = -9223372036854775808;

(Explanation: the 9223372036854775808 literal is a 64-bit number with
the most significant bit set; this number does not fit in 64-bit signed
long long, but does fit in unsigned long long. The result of the negation is still
unsigned, 9223372036854775808, which still cannot be stored in the
signed long long without invoking implementation-defined behavior. The
peculiar part is that the negative value -9223372036854775808, which the
user intended from the start, is representable in two's complement
signed long long.)
Arthur O'Dwyer
2018-02-27 01:28:13 UTC
Permalink
Post by Andrey Semashev
Post by Arthur O'Dwyer
Unsigned-to-signed conversion is UB only when there are enum types involved.
I have added two tables to my paper that explain the current situation
https://quuxplusone.github.io/draft/twosc-conservative.html#intro
You may want to add a peculiar use case to the proposal: IIUC, it should
long long x = -9223372036854775808;
(Explanation: the 9223372036854775808 literal is a 64-bit number with the
most significant bit set; this number does not fit in 64-bit signed long
long, but does fit in unsigned. The result of the negation is still
unsigned, 9223372036854775808, which still cannot be stored in the signed
long long without invoking implementation-defined behavior. The peculiar
part is that the negative value -9223372036854775808, which the user
intended from the start, is representable in two's complement signed long
long.)
Thanks — updated!
(And I notice that GCC and Clang do subtly different things with that
expression. On Clang -9223372036854775808 is unsigned and the conversion
unfolds as you described; on GCC -9223372036854775808 is actually of type
signed __int128, so it gets a signed-to-signed conversion.... which is
still implementation-defined, though.)

–Arthur
Thiago Macieira
2018-02-27 01:33:57 UTC
Permalink
Post by Arthur O'Dwyer
unfolds as you described; on GCC -9223372036854775808 is actually of type
signed __int128, so it gets a signed-to-signed conversion.... which is
still implementation-defined, though.)
Only if the target type cannot represent the value. In this case, it can.
--
Thiago Macieira - thiago (AT) macieira.info - thiago (AT) kde.org
Software Architect - Intel Open Source Technology Center
Arthur O'Dwyer
2018-02-27 02:02:02 UTC
Permalink
Post by Thiago Macieira
Post by Arthur O'Dwyer
unfolds as you described; on GCC -9223372036854775808 is actually of type
signed __int128, so it gets a signed-to-signed conversion.... which is
still implementation-defined, though.)
Only if the target type cannot represent the value. In this case, it can.
Oops! Good catch. Fixed. Thanks.

–Arthur
Chris Hallock
2018-02-27 23:32:44 UTC
Permalink
Post by Arthur O'Dwyer
Post by Arthur O'Dwyer
Unsigned-to-signed conversion is UB only when there are enum types
involved.
Post by Arthur O'Dwyer
I have added two tables to my paper that explain the current situation
https://quuxplusone.github.io/draft/twosc-conservative.html#intro
You may want to add a peculiar use case to the proposal: IIUC, it should
long long x = -9223372036854775808;
(Explanation: the 9223372036854775808 literal is a 64-bit number with
the most significant bit set; this number does not fit in 64-bit signed
long long, but does fit in unsigned. The result of the negation is still
unsigned, 9223372036854775808, which still cannot be stored in the
signed long long without invoking implementation-defined behavior. The
peculiar part is that the negative value -9223372036854775808, which the
user intended from the start, is representable in two's complement
signed long long.)
Strictly speaking, 9223372036854775808 is ill-formed on current
implementations because its form (decimal literal with no suffix) restricts
it to signed, but no signed type is big enough; see [lex.icon]/2-3
<https://timsong-cpp.github.io/cppwp/lex.icon#2>. GCC and Clang correctly
issue a warning (or error, under -pedantic-errors) for this.
Andrey Semashev
2018-02-27 01:13:22 UTC
Permalink
Post by Andrey Semashev
Having two's complement signed in the standard is worth it even without
well defined overflow semantics. Just the guaranteed ability to cast
from unsigned to signed without UB alone makes it worth it. Currently,
there simply isn't a way to cast from unsigned to signed portably and
fail-safe. The whole "let's use signed integers everywhere" incentive
exists because of this (and therefore is misguided, IMHO).
Did you mean UB? Unsigned to signed conversion is ID (implementation-defined), as far as I recall.
Right, it's implementation-defined. Not much better, though.
Post by Andrey Semashev
Wouldn't it be enough to require the signed-unsigned-signed conversion
to round trip (losing the sign of negative zero)?
This would not provide portable conversion from unsigned to signed
negative values.

And why should the conversion lose any information?
'Edward Catmur' via ISO C++ Standard - Future Proposals
2018-02-27 08:51:42 UTC
Permalink
Post by Andrey Semashev
Post by Andrey Semashev
Having two's complement signed in the standard is worth it even without
well defined overflow semantics. Just the guaranteed ability to cast
from unsigned to signed without UB alone makes it worth it. Currently,
there simply isn't a way to cast from unsigned to signed portably and
fail-safe. The whole "let's use signed integers everywhere" incentive
exists because of this (and therefore is misguided, IMHO).
Did you mean UB? Unsigned to signed conversion is ID (implementation-defined), as far as I recall.
Right, it's implementation-defined. Not much better, though.
It makes a big difference; it's usable in constexpr, for one. Also, it
isn't diagnosable by ubsan in normal modes.
Post by Andrey Semashev
Wouldn't it be enough to require the signed-unsigned-signed conversion to
Post by Andrey Semashev
round trip (losing the sign of negative zero)?
This would not provide portable conversion from unsigned to signed
negative values.
It would provide portable conversion for every value in unsigned that is
the result of a conversion from a value in signed, which is every value in
unsigned if signed is two's complement and every value except 0x80...
otherwise. In what circumstances would that be insufficient?
Post by Andrey Semashev
And why should the conversion lose any information?
It's the signed-to-unsigned conversion that loses information; this is
because we want zero and negative zero to be treated similarly. It's the
same for conversions from floating point to integral.
Andrey Semashev
2018-02-27 10:01:14 UTC
Permalink
On 02/27/18 11:51, 'Edward Catmur' via ISO C++ Standard - Future
On Tue, Feb 27, 2018 at 1:13 AM, Andrey Semashev
Wouldn't it be enough to require the signed-unsigned-signed
conversion to round trip (losing the sign of negative zero)?
This would not provide portable conversion from unsigned to signed
negative values.
It would provide portable conversion for every value in unsigned that is
the result of a conversion from a value in signed, which is every value
in unsigned if signed is two's complement and every value except 0x80...
otherwise. In what circumstances would that be insufficient?
See my -9223372036854775808 example in the reply to Arthur O'Dwyer.

Also, the conversions of unsigned to signed negative values need to be
well defined and portable if you want to implement signed overflow in a
library. You would normally convert to unsigned, perform the arithmetic,
and then memcpy the result to signed (because a regular cast is
implementation-defined, which basically gives no guarantees on the
result). For that to provide a portable and meaningful result, the signed
representation must be two's complement.
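The memcpy technique described above can be sketched as follows
(`wrapping_add` is an illustrative name, not from the thread; the memcpy
step is only meaningful on a two's-complement implementation without
padding bits, which is the very guarantee under discussion):

```cpp
#include <cstdint>
#include <cstring>

// Do the arithmetic in unsigned, where wraparound is well defined
// (reduction modulo 2^64), then copy the resulting bit pattern back into
// the signed type. On a two's-complement implementation this yields the
// wrapped signed result; a plain cast back would be implementation-defined.
std::int64_t wrapping_add(std::int64_t a, std::int64_t b) {
    std::uint64_t ua = static_cast<std::uint64_t>(a);  // well defined for all a
    std::uint64_t ub = static_cast<std::uint64_t>(b);
    std::uint64_t ur = ua + ub;                        // unsigned overflow wraps
    std::int64_t result;
    std::memcpy(&result, &ur, sizeof result);          // reinterpret the bits
    return result;
}
```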
And why should the conversion lose any information?
It's the signed-to-unsigned conversion that loses information; this is
because we want zero and negative zero to be treated similarly. It's the
same for conversions from floating point to integral.
Floating point numbers are not representable in integers in general,
hence the inherent loss of information on conversion. Signed integers
are much closer to unsigned integers in that the number of distinct
values is the same and non-negative values are represented the same way.
Therefore the conversion can be made without loss of information.

Anyway, I don't think negative zero is relevant as I'm in favor of
mandating two's complement representation.
'Edward Catmur' via ISO C++ Standard - Future Proposals
2018-02-27 11:19:13 UTC
Permalink
On 02/27/18 11:51, 'Edward Catmur' via ISO C++ Standard - Future Proposals
On Tue, Feb 27, 2018 at 1:13 AM, Andrey Semashev <
Wouldn't it be enough to require the signed-unsigned-signed
conversion to round trip (losing the sign of negative zero)?
This would not provide portable conversion from unsigned to signed
negative values.
It would provide portable conversion for every value in unsigned that is
the result of a conversion from a value in signed, which is every value in
unsigned if signed is two's complement and every value except 0x80...
otherwise. In what circumstances would that be insufficient?
See my -9223372036854775808 example in the reply to Arthur O'Dwyer.
Negating 9223372036854775808 in unsigned gives 9223372036854775808, which
happens to be the result of converting -9223372036854775808 into (64-bit)
unsigned. So provided a round-trip guarantee and that -9223372036854775808 is
a value in the signed type, which it is for 64-bit two's complement,
9223372036854775808 in unsigned would convert to -9223372036854775808 in
signed. That is, -9223372036854775808 would be guaranteed to result in
-9223372036854775808 if that value exists in the signed type.
Also, the conversions of unsigned to signed negative need to be well
defined and portable if you want to implement signed overflow in a library.
You would normally convert to unsigned, perform arithmetics and then memcpy
the result to signed (because a regular cast is implementation-defined,
which basically gives no guarantees on the result). For that to provide a
portable and meaningful result, signed representation must be two's
complement.
Surely the normal approach would be to wrap the compiler intrinsics for
overflow arithmetic. If they aren't available for whatever reason, tests
will confirm that the conversion behaves as expected. If you can't use the
intrinsics and can't test, how confident can you be that the compiler has
implemented the conversion correctly or indeed that you have written your
arithmetic correctly? On the other hand if the conversion is guaranteed to
round-trip then you can use a cast instead of memcpy. There is no
circumstance in which memcpy is preferable to an arithmetic conversion,
except in the entirely hypothetical situation in which you have confirmed
that the implementation uses two's complement but not that it performs the
conversion from unsigned to signed in the obvious manner.

And why should the conversion lose any information?
It's the signed-to-unsigned conversion that loses information; this is
because we want zero and negative zero to be treated similarly. It's the
same for conversions from floating point to integral.
Floating point numbers are not representable in integers in general, hence
the inherent loss of information on conversion. Signed integers are much
closer to unsigned integers in that the number of distinct values is the
same and non-negative values are represented the same way. Therefore the
conversion can be made without loss of information.
The number of distinct values is the same only if positive and negative
zero are considered distinct.
Anyway, I don't think negative zero is relevant as I'm in favor of
mandating two's complement representation.
Negative zero is the reason why the signed-unsigned conversion loses
information only if the signed type has negative zero. Obviously it is
irrelevant if the signed type does not have negative zero. In itself this
is not a reason to require two's complement.
Andrey Semashev
2018-02-27 12:07:32 UTC
Permalink
On 02/27/18 14:19, 'Edward Catmur' via ISO C++ Standard - Future
On Tue, Feb 27, 2018 at 10:01 AM, Andrey Semashev
On 02/27/18 11:51, 'Edward Catmur' via ISO C++ Standard - Future
On Tue, Feb 27, 2018 at 1:13 AM, Andrey Semashev
        Wouldn't it be enough to require the signed-unsigned-signed
        conversion to round trip (losing the sign of negative
zero)?
    This would not provide portable conversion from unsigned to
signed
    negative values.
It would provide portable conversion for every value in unsigned
that is the result of a conversion from a value in signed, which
is every value in unsigned if signed is two's complement and
every value except 0x80... otherwise. In what circumstances
would that be insufficient?
See my -9223372036854775808 example in the reply to Arthur O'Dwyer.
Negating 9223372036854775808 in unsigned gives 9223372036854775808,
which happens to be the result of converting -9223372036854775808 into
(64-bit) unsigned. So provided a round-trip guarantee and that
-9223372036854775808 is a value in the signed type, which it is for
64-bit two's complement, 9223372036854775808 in unsigned would convert
to -9223372036854775808 in signed. That is, -9223372036854775808 would
be guaranteed to result in -9223372036854775808 if that value exists in
the signed type.
No, the roundtrip guarantee does not provide the guarantee that the
resulting signed value will be -9223372036854775808. In other words, the
roundtrip guarantee is this:

unsigned long long x1 = 9223372036854775808;
signed long long x2 = x1; // x2 value is unknown. trap, possibly?
unsigned long long x3 = x2; // x3 == x1

What I really want, and what the original example requires, is that x2
is guaranteed to have value -9223372036854775808 and nothing else.
Also, the conversions of unsigned to signed negative need to be well
defined and portable if you want to implement signed overflow in a
library. You would normally convert to unsigned, perform arithmetics
and then memcpy the result to signed (because a regular cast is
implementation-defined, which basically gives no guarantees on the
result). For that to provide a portable and meaningful result,
signed representation must be two's complement.
Surely the normal approach would be to wrap the compiler intrinsics for
overflow arithmetic.
Compiler intrinsics or inline assembler are not portable. I want this
ability in portable legal C++.
If they aren't available for whatever reason, tests
will confirm that the conversion behaves as expected.
As long as the conversion is implementation-defined, the tests would
only verify that the implementation is doing what we expect. While that
sort of protects from unexpected results (as long as the tests are run
by the users), that is not what I would like to have as a library
developer. I want to be able to say to my users what will be the result
of operations of my library and I can't do that if I use
implementation-defined or undefined behavior in my implementation.
If you can't use
the intrinsics and can't test, how confident can you be that the
compiler has implemented the conversion correctly or indeed that you
have written your arithmetic correctly? On the other hand if the
conversion is guaranteed to round-trip then you can use a cast instead
of memcpy. There is no circumstance in which memcpy is preferable to an
arithmetic conversion, except in the entirely hypothetical situation in
which you have confirmed that the implementation uses two's complement
but not that it performs the conversion from unsigned to signed in the
obvious manner.
I consider whatever is not defined by the standard to be not portable.
Even if it is implementation-defined and if you would expect every
implementation to do it in an obvious way. If it's not defined then it
can do whatever weird stuff. OTOH, if there is only one reasonable way
to implement it then why is it not defined in the standard?

As long as I'm in the non-portable land, I can do whatever hacks the
particular compiler allows to get the job done. But I don't consider
that a good thing that every other developer should do in their code.
Therefore I want this very basic thing, integer conversion, to "just
work" and produce portable well-defined results, unsurprising to most
users. Naturally, if integer conversion is defined to work the way we
expect (i.e. to be equivalent to memcpy) the memcpy trick becomes
unnecessary.
'Edward Catmur' via ISO C++ Standard - Future Proposals
2018-02-27 15:27:25 UTC
Permalink
On 02/27/18 14:19, 'Edward Catmur' via ISO C++ Standard - Future Proposals
On Tue, Feb 27, 2018 at 10:01 AM, Andrey Semashev <
On 02/27/18 11:51, 'Edward Catmur' via ISO C++ Standard - Future
On Tue, Feb 27, 2018 at 1:13 AM, Andrey Semashev
Wouldn't it be enough to require the
signed-unsigned-signed
conversion to round trip (losing the sign of negative zero)?
This would not provide portable conversion from unsigned to signed
negative values.
It would provide portable conversion for every value in unsigned
that is the result of a conversion from a value in signed, which
is every value in unsigned if signed is two's complement and
every value except 0x80... otherwise. In what circumstances
would that be insufficient?
See my -9223372036854775808 example in the reply to Arthur O'Dwyer.
Negating 9223372036854775808 in unsigned gives 9223372036854775808, which
happens to be the result of converting -9223372036854775808 into (64-bit)
unsigned. So provided a round-trip guarantee and that
-9223372036854775808 is a value in the signed type, which it is for 64-bit
two's complement, 9223372036854775808 in unsigned would convert to
-9223372036854775808 in signed. That is, -9223372036854775808 would be
guaranteed to result in -9223372036854775808 if that value exists in the
signed type.
No, the roundtrip guarantee does not provide the guarantee that the
resulting signed value will be -9223372036854775808. In other words, the
roundtrip guarantee is this:
unsigned long long x1 = 9223372036854775808;
signed long long x2 = x1; // x2 value is unknown. trap, possibly?
unsigned long long x3 = x2; // x3 == x1
What I really want, and what the original example requires, is that x2 is
guaranteed to have value -9223372036854775808 and nothing else.
If x3 == x1, then x2 must have had the value -9223372036854775808 and it
cannot have been a trap representation. There is no other value in signed
64-bit that converts to 9223372036854775808.
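A worked example of that claim (a sketch, assuming 64-bit types; `to_u` is
a hypothetical helper name): signed-to-unsigned conversion is defined as
reduction modulo 2^64, which maps each distinct signed value to a distinct
unsigned value, so 2^63 has exactly one preimage.

```cpp
#include <cstdint>

// Signed-to-unsigned conversion is reduction modulo 2^64, so each of the
// 2^64 signed values maps to a distinct unsigned value; in particular,
// only INT64_MIN converts to 9223372036854775808 (2^63).
constexpr std::uint64_t to_u(std::int64_t v) {
    return static_cast<std::uint64_t>(v);  // well defined for every v
}

static_assert(to_u(INT64_MIN) == 9223372036854775808u,
              "INT64_MIN is the unique preimage of 2^63");
static_assert(to_u(-1) == 0xFFFFFFFFFFFFFFFFu,
              "-1 maps to 2^64 - 1");
static_assert(to_u(INT64_MIN) != to_u(INT64_MIN + 1),
              "distinct signed values map to distinct unsigned values");
```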

Actually, this raises an issue with your paper: you appear to have left
open whether values of the form 0x80... (such as -9223372036854775808) are
trap representations. Was this intentional, or have I missed something?

Also, the conversions of unsigned to signed negative need to be well
defined and portable if you want to implement signed overflow in a
library. You would normally convert to unsigned, perform arithmetics
and then memcpy the result to signed (because a regular cast is
implementation-defined, which basically gives no guarantees on the
result). For that to provide a portable and meaningful result,
signed representation must be two's complement.
Surely the normal approach would be to wrap the compiler intrinsics for
overflow arithmetic.
Compiler intrinsics or inline assembler are not portable. I want this
ability in portable legal C++.
The intrinsics are available on every actively maintained compiler, and
result in clearer and more efficient code. There is little point in
expanding the space of strictly conforming programs if they are slow and
abstruse. It would be far better to standardize the intrinsics.

If they aren't available for whatever reason, tests will confirm that the
conversion behaves as expected.
As long as the conversion is implementation-defined, the tests would only
verify that the implementation is doing what we expect. While that sort of
protects from unexpected results (as long as the tests are run by the
users), that is not what I would like to have as a library developer. I
want to be able to say to my users what will be the result of operations of
my library and I can't do that if I use implementation-defined or undefined
behavior in my implementation.
You absolutely can; you either test at compile time that the
implementation-defined behavior enables your approach, or you test that the
implementation is one documented to behave as you require. Or you ask your
users to do so.
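Such a compile-time check can be as small as a couple of static_asserts;
implementation-defined conversions (unlike undefined behavior) remain
constant expressions, so a library can pin down its assumptions up front
at zero run-time cost (a sketch; the assertion messages are illustrative):

```cpp
#include <cstdint>

// On any implementation where the unsigned-to-signed conversion does not
// behave as a two's-complement reinterpretation, compilation stops with a
// message instead of the library silently miscomputing.
static_assert(static_cast<std::int32_t>(0xFFFFFFFFu) == -1,
              "library assumes two's-complement unsigned-to-signed narrowing");
static_assert(static_cast<std::int8_t>(0x80u) == -128,
              "library assumes two's-complement unsigned-to-signed narrowing");
```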

If you can't use the intrinsics and can't test, how confident can you be
that the compiler has implemented the conversion correctly or indeed that
you have written your arithmetic correctly? On the other hand if the
conversion is guaranteed to round-trip then you can use a cast instead of
memcpy. There is no circumstance in which memcpy is preferable to an
arithmetic conversion, except in the entirely hypothetical situation in
which you have confirmed that the implementation uses two's complement but
not that it performs the conversion from unsigned to signed in the obvious
manner.
I consider whatever is not defined by the standard to be not portable.
Even if it is implementation-defined and if you would expect every
implementation to do it in an obvious way. If it's not defined then it can
do whatever weird stuff. OTOH, if there is only one reasonable way to
implement it then why is it not defined in the standard?
Implementation-defined behavior is required to be documented, so you can
very quickly check whether any "weird stuff" is on the cards. And for the
most part it is detectable at compile time by user code.

As long as I'm in the non-portable land, I can do whatever hacks the
particular compiler allows to get the job done. But I don't consider that a
good thing that every other developer should do in their code. Therefore I
want this very basic thing, integer conversion, to "just work" and produce
portable well-defined results, unsurprising to most users. Naturally, if
integer conversion is defined to work the way we expect (i.e. to be
equivalent to memcpy) the memcpy trick becomes unnecessary.
If you're writing low-level code such as is dependent on unsigned-to-signed
conversion, you will tune your code to implementations so it will not be
strictly conforming anyway.
Arthur O'Dwyer
2018-02-27 18:03:08 UTC
Permalink
On Tue, Feb 27, 2018 at 7:27 AM, 'Edward Catmur' via ISO C++ Standard -
On Tue, Feb 27, 2018 at 12:07 PM, Andrey Semashev <
Post by Andrey Semashev
On 02/27/18 14:19, 'Edward Catmur' via ISO C++ Standard - Future
Post by 'Edward Catmur' via ISO C++ Standard - Future Proposals
Negating 9223372036854775808 in unsigned gives 9223372036854775808,
which happens to be the result of converting -9223372036854775808 into
(64-bit) unsigned. So provided a round-trip guarantee and that
-9223372036854775808 is a value in the signed type, which it is for 64-bit
two's complement, 9223372036854775808 in unsigned would convert to
-9223372036854775808 in signed. That is, -9223372036854775808 would be
guaranteed to result in -9223372036854775808 if that value exists in the
signed type.
No, the roundtrip guarantee does not provide the guarantee that the
resulting signed value will be -9223372036854775808. In other words, the
unsigned long long x1 = 9223372036854775808;
signed long long x2 = x1; // x2 value is unknown. trap, possibly?
unsigned long long x3 = x2; // x3 == x1
What I really want, and what the original example requires, is that x2 is
guaranteed to have value -9223372036854775808 and nothing else.
If x3 == x1, then x2 must have had the value -9223372036854775808 and it
cannot have been a trap representation. There is no other value in signed
64-bit that converts to 9223372036854775808.
I observe that you two may be miscommunicating your premises. Andrey, it
seems that you want the expression `int64_t(-9223372036854775808)` to
evaluate to the numeric value -9223372036854775808. Edward, IIUC, you're
talking about present-day standard C++, where it is possible that the
number -9223372036854775808 is not representable in `int64_t` at all — in
which case clearly Andrey's goal is impossible. Andrey can't make
`int64_t(-9223372036854775808)` evaluate to -9223372036854775808 if there
is no such value in `int64_t`.

In both JF Bastien's P0907R0 and my "conservative" tweak of it, we are both
proposing that the number -9223372036854775808 should be representable in
future-C++. I think this is a useful property that programmers will enjoy
relying upon.

Edward, do you happen to have first-hand experience with any C++ platform
where -9223372036854775808 is not representable in `int64_t`? (Any
second-hand knowledge of such platforms?) This could be valuable input to
the discussion around P0907R0.


Actually, this raises an issue with your paper: you appear to have left
open whether values of the form 0x80... (such as -9223372036854775808)
are trap representations. Was this intentional, or have I missed something?
I am not sure what JF's intention is, but my guess is that he intends
integral types to have no trap representations at all.
Certainly, my own intention is for integral types to have no trap
representations at all. However, I admit that I tried and failed to figure
out where that wording lives.
What chain of reasoning in the present-day C++ standard do you think
permits implementations to make 0x80... a trap representation? If you can
point to a surviving loophole, I will do my best to close it up.
https://quuxplusone.github.io/draft/twosc-conservative.html#word

–Arthur

P.S.:

Post by Andrey Semashev
As long as the conversion is implementation-defined, the tests would only
verify that the implementation is doing what we expect. While that sort of
protects from unexpected results (as long as the tests are run by the
users), that is not what I would like to have as a library developer. I
want to be able to say to my users what will be the result of operations of
my library and I can't do that if I use implementation-defined or undefined
behavior in my implementation.
You absolutely can; you either test at compile time that the
implementation-defined behavior enables your approach, or you test that the
implementation is one documented to behave as you require. Or you ask your
users to do so.
This suggestion smells like the awkward `std::endian
<http://en.cppreference.com/w/cpp/types/endian>` feature from the C++2a
working draft. Down this road lies `if constexpr
(std::integer_representation == std::twos_complement)` and so on. I would
much rather just cut out the problem at the root, given the empirical
dearth (absence?) of present-day C++ implementations that are not
two's-complement. (Endianness is different because there empirically *do*
exist both big-endian and little-endian systems.)
Andrey Semashev
2018-02-27 19:42:32 UTC
Permalink
Post by Arthur O'Dwyer
On Tue, Feb 27, 2018 at 7:27 AM, 'Edward Catmur' via ISO C++ Standard -
Actually, this raises an issue with your paper: you appear to have
left open whether values of the form 0x80... (such as
-9223372036854775808) are trap representations. Was this
intentional, or have I missed something?
I am not sure what JF's intention is, but my guess is that he intends
integral types to have no trap representations at all.
Certainly, my own intention is for integral types to have no trap
representations at all. However, I admit that I tried and failed to
figure out where that wording lives.
What chain of reasoning in the present-day C++ standard do you think
permits implementations to make 0x80... a trap representation?  If you
can point to a surviving loophole, I will do my best to close it up.
https://quuxplusone.github.io/draft/twosc-conservative.html#word
As I understand it, it follows from [basic.types]/4, which explicitly
separates object representation and value representation; the bits that
are part of object representation but not value representation are
called padding bits.

Further, [basic.fundamental] does not require value and object
representation to match for integral types, except narrow character
types. [basic.fundamental]/3 refers to the C standard section 5.2.4.2.1,
which defines integral limits and then refers to section 6.2.6 that
defines representation of types. 6.2.6.2 defines integer types and
paragraph 2 explicitly mentions padding bits.

Regarding trap bits, I didn't see a clause that explicitly allows them
to be in the object representation of integral types, but I didn't see
one that prohibits them. But there is `std::numeric_limits<T>::traps`
member ([numeric.limits.members]/64), which is said to be meaningful for
all types, including integers.
Andrey Semashev
2018-02-27 18:36:10 UTC
Permalink
On 02/27/18 18:27, 'Edward Catmur' via ISO C++ Standard - Future
On Tue, Feb 27, 2018 at 12:07 PM, Andrey Semashev
No, the roundtrip guarantee does not provide the guarantee that the
resulting signed value will be -9223372036854775808. In other words, the
roundtrip guarantee is this:
  unsigned long long x1 = 9223372036854775808;
  signed long long x2 = x1; // x2 value is unknown. trap, possibly?
  unsigned long long x3 = x2; // x3 == x1
What I really want, and what the original example requires, is that
x2 is guaranteed to have value -9223372036854775808 and nothing else.
If x3 == x1, then x2 must have had the value -9223372036854775808 and it
cannot have been a trap representation. There is no other value in
signed 64-bit that converts to 9223372036854775808.
I don't think so. For example, the implementation could produce a trap
value that it is able to convert back to 9223372036854775808 unsigned
but using it in any other context would result in a trap.
Actually, this raises an issue with your paper: you appear to have left
open whether values of the form 0x80... (such as -9223372036854775808)
are trap representations. Was this intentional, or have I missed something?
I have not presented a paper in this thread. The two mentioned papers
were from Arthur and JF Bastien. But as I understand Arthur's proposal,
it doesn't prohibit trap representations or padding bits. Let Arthur
correct me if I'm wrong.
Compiler intrinsics or inline assembler are not portable. I want
this ability in portable legal C++.
The intrinsics are available on every actively maintained compiler, and
result in clearer and more efficient code. There is little point in
expanding the space of strictly conforming programs if they are slow and
abstruse. It would be far better to standardize the intrinsics.
The only relevant intrinsics I'm aware of are gcc's overflow
intrinsics[1], which appeared in gcc 7, IIRC. This is one compiler, a
very recent version, too. Also, note that gcc supports only two's
complement signed integers.

As for standardizing intrinsics, or rather function wrappers, I don't
think this is the right interface for such functionality.

[1]: https://gcc.gnu.org/onlinedocs/gcc/Integer-Overflow-Builtins.html
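For readers following along, the builtins at [1] are used roughly like
this (a sketch; `__builtin_add_overflow` is a GCC/Clang extension rather
than standard C++, and `add_wraps` is an illustrative wrapper name):

```cpp
#include <cstdint>

// GCC/Clang extension: computes a + b as if in infinite precision, stores
// the wrapped two's-complement result in *out, and returns true if the
// mathematical result did not fit in the destination type.
bool add_wraps(std::int64_t a, std::int64_t b, std::int64_t* out) {
    return __builtin_add_overflow(a, b, out);
}
```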
As long as the conversion is implementation-defined, the tests would
only verify that the implementation is doing what we expect. While
that sort of protects from unexpected results (as long as the tests
are run by the users), that is not what I would like to have as a
library developer. I want to be able to say to my users what will be
the result of operations of my library and I can't do that if I use
implementation-defined or undefined behavior in my implementation.
You absolutely can; you either test at compile time that the
implementation-defined behavior enables your approach, or you test that
the implementation is one documented to behave as you require.
As a library author, I cannot reasonably inspect every compiler's
documentation to find out if it does something fancy on integer
conversion. And I shouldn't have to. We have the standard to tell us
what should happen and currently it tells us anything could happen, as
long as it is documented.
Or you ask your users to do so.
In other words, shift responsibility further down the line.
Richard Smith
2018-02-28 18:28:47 UTC
Permalink
On 02/27/18 18:27, 'Edward Catmur' via ISO C++ Standard - Future Proposals
On Tue, Feb 27, 2018 at 12:07 PM, Andrey Semashev <
No, the roundtrip guarantee does not provide the guarantee that the
resulting signed value will be -9223372036854775808. In other words, the
roundtrip guarantee is this:
unsigned long long x1 = 9223372036854775808;
signed long long x2 = x1; // x2 value is unknown. trap, possibly?
unsigned long long x3 = x2; // x3 == x1
What I really want, and what the original example requires, is that
x2 is guaranteed to have value -9223372036854775808 and nothing else.
If x3 == x1, then x2 must have had the value -9223372036854775808 and it
cannot have been a trap representation. There is no other value in signed
64-bit that converts to 9223372036854775808.
I don't think so. For example, the implementation could produce a trap
value that it is able to convert back to 9223372036854775808 unsigned but
using it in any other context would result in a trap.
Actually, this raises an issue with your paper: you appear to have left
open whether values of the form 0x80... (such as -9223372036854775808) are
trap representations. Was this intentional, or have I missed something?
I have not presented a paper in this thread. The two mentioned papers were
from Arthur and JF Bastien. But as I understand Arthur's proposal, it
doesn't prohibit trap representations or padding bits. Let Arthur correct
me if I'm wrong.
Compiler intrinsics or inline assembler are not portable. I want
this ability in portable legal C++.
The intrinsics are available on every actively maintained compiler, and
result in clearer and more efficient code. There is little point in
expanding the space of strictly conforming programs if they are slow and
abstruse. It would be far better to standardize the intrinsics.
The only relevant intrinsics I'm aware of are gcc's overflow
intrinsics[1], which appeared in gcc 7, IIRC. This is one compiler, a very
recent version, too.
Actually, these builtins were added in Clang 3.4 (and GCC eventually
implemented compatible builtins); they've been in a production compiler for
several years, and are now available in Clang, in GCC, and in EDG's
Clang-compatible mode (though they're not yet enabled in its GCC-compatible
mode as far as I can tell).
Also, note that gcc supports only two's complement signed integers.
As for standardizing intrinsics, or rather function wrappers, I don't
think this is the right interface for such functionality.
[1]: https://gcc.gnu.org/onlinedocs/gcc/Integer-Overflow-Builtins.html
As long as the conversion is implementation-defined, the tests would
only verify that the implementation is doing what we expect. While
that sort of protects from unexpected results (as long as the tests
are run by the users), that is not what I would like to have as a
library developer. I want to be able to say to my users what will be
the result of operations of my library and I can't do that if I use
implementation-defined or undefined behavior in my implementation.
You absolutely can; you either test at compile time that the
implementation-defined behavior enables your approach, or you test that the
implementation is one documented to behave as you require.
As a library author, I cannot reasonably inspect every compiler's
documentation to find out whether it does something fancy on integer
conversion. And I shouldn't have to. We have the standard to tell us what
should happen, and currently it tells us anything could happen, as long
as it is documented.
Or you ask your users to do so.
In other words, shift responsibility further down the line.
Myriachan
2018-03-01 03:29:22 UTC
Permalink
I haven't had time to review this whole thread, but I saw an error on this
page:

https://quuxplusone.github.io/draft/twosc-conservative.html#intro

"1 << 31 Undefined behavior"

1 << 31 is already well-defined as INT_MIN on two's-complement 32-bit int
systems according to the current Standard:

"Otherwise, if E1 has a signed type and non-negative value, and E1 * 2^E2
is representable in the corresponding unsigned type of the result type,
then that value, converted to the result type, is the resulting value;
otherwise, the behavior is undefined."

1 << 31 is representable in unsigned int on such a system. Thus, it is
already legal to shift a 1 bit into the sign bit when the input was
positive, but you can't go past the sign bit. 1 << 31 == INT_MIN, but 2 <<
31 is undefined behavior.

constexpr int x = 1 << 31; // OK
constexpr int y = 2 << 31; // ill-formed: undefined behavior means the
expression is not a constant expression

Melissa
Arthur O'Dwyer
2018-03-01 08:53:11 UTC
Permalink
Post by Myriachan
I haven't had time to review this whole thread, but I saw an error on this
https://quuxplusone.github.io/draft/twosc-conservative.html#intro
"1 << 31 Undefined behavior"
1 << 31 is already well-defined as INT_MIN on two's-complement 32-bit int
"Otherwise, if E1 has a signed type and non-negative value, and E1 * 2^E2
is representable *in the corresponding unsigned type* of the result type,
then that value, converted to the result type, is the resulting value;
otherwise, the behavior is undefined."
Thanks! I have now updated the table to include (2 << 31) as UB, and
changed (1 << 31) to "Implementation-defined value."

I don't think you can safely say that the particular implementation-defined
value in this case *must* be INT_MIN, even on a two's-complement system. It
must be (int)0x80000000u for sure, but the value of that expression is
still implementation-defined, and I don't think there's any hard
requirement that it be defined to exactly INT_MIN on any particular
system. The relevant wording AFAIK is [conv.integral].

Thanks for the correction! This stuff is pretty hairy.
–Arthur
Andrey Semashev
2018-03-01 09:46:22 UTC
Permalink
Post by Myriachan
I haven't had time to review this whole thread, but I saw an error on
https://quuxplusone.github.io/draft/twosc-conservative.html#intro
"1 << 31     Undefined behavior"
1 << 31 is already well-defined as INT_MIN on two's-complement 32-bit
"Otherwise, if E1 has a signed type and non-negative value, and E1 *
2^E2 is representable in the corresponding unsigned type of the result
type, then that value, converted to the result type, is the resulting
value; otherwise, the behavior is undefined."
1 << 31 is representable in unsigned int on such a system.  Thus, it is
already legal to shift a 1 bit into the sign bit when the input was
positive, but you can't go past the sign bit.  1 << 31 == INT_MIN, but 2
<< 31 is undefined behavior.
1 << 31 is implementation-defined then because its result, the value of
2147483648, is not representable as a 32-bit signed int.
Richard Smith
2018-02-28 18:07:42 UTC
Permalink
Post by JF Bastien
I’ve asked Chandler but I’ll ask you as well: data would be great in
getting committee consensus. To me “false positive” sounds an awful lot
like “not a bug” in this context :-)
I don't understand this line of reasoning. If you consider *unsigned*
wrapping detection to be a good thing in the sanitizer (I assume that is
what you mean by "not a bug"), why on earth do you want *signed* overflow
to wrap? Do you expect sanitizers to detect this? If so, why make it
legitimate behavior?

Without wrapping, I don't see enough motivation to change the status quo,
especially since it needlessly introduces an incompatibility with C. And I
certainly don't see how this is in the "top 20" things for C++20.


IMO, if the sanitizers are giving false positives, they should remove the
misleading bit about no false positives as a goal in the documentation
<https://clang.llvm.org/docs/UndefinedBehaviorSanitizer.html#issue-suppression>.


This sanitizer is not part of ubsan, so that comment does not apply to it.
(For example, it is not enabled by -fsanitize=undefined.)

If that attribute suppresses -fsanitize=unsigned-integer-overflow, I'd
consider that a bug. We should have a different suppression mechanism for
sanitizer complaints on code with defined behaviour.
--
Nevin ":-)" Liber <mailto:***@eviloverlord.com> +1-847-691-1404
<(847)%20691-1404>
JF Bastien
2018-02-26 17:37:36 UTC
Permalink
Post by JF Bastien
Post by JF Bastien
Hi Arthur,
I’ll be in JAX and will definitely champion my own approach, but will make sure any other approaches are discussed thoroughly. I’ll then ask for direction polls from EWG, and will follow the proposed direction in an updated paper.
And if you manage to write the updated paper overnight and have it back on the table the next day at JAX, then my paper will be utterly superfluous. :)
I am, however, worried that the writing of a new paper might slip more than a day, which would end up with you coming back in the pre-Rapperswil mailing with another two's-complement paper after the perception that your first two's-complement paper was "rejected" in Jacksonville, which would set a perceived negative precedent in people's minds.
I’ve frequently presented updated papers after obtaining feedback: evenings are for paper writing.
And if you manage to write the updated paper overnight and have it back on the table the next day at JAX, then my paper will be utterly superfluous. :)
Write your paper if you want, I just don’t see a point to it. Editing the paper is trivial, your paper just doesn’t help at all.
Post by JF Bastien
Post by JF Bastien
- Unintentional unsigned wraparound (for example, in the argument to `malloc`) has been a known source of bugs for a long time. See for example [Regehr2012] <https://www.cs.utah.edu/~regehr/papers/overflow12.pdf>, whose final sentence is, "Our results also imply that tools for detecting integer numerical errors need to distinguish intentional from unintentional uses of wraparound operations — a challenging task — in order to minimize false alarms. [emphasis added]" The current undefinedness of signed overflow permits implementations, such as UBSan, to detect all signed wraparound behavior as unintentional by definition, and diagnose it accordingly.
unsigned wraparound isn’t UB, and I claim that signed overflow is UB because of the 3 representation, not to catch bugs, otherwise unsigned overflow would also have been UB. [...] I claim that this is also an emergent feature, not by design, caused by the 3 signed integer representations.
I think you are right in the historical sense. These days the emergent "catch bugs" rationale has survived, though, even as the exotic hardwares have died out.
Sure, I just think it’s unprincipled in where it’s applied because it’s only applied through historical accidents. That’s highly non-intuitive: "where can I catch bugs? Where there used to be hardware variance!” How do we expect newcomers to learn C++? We can’t just say “C++ offers a direct map to hardware”, and “C++ leaves no room for a language between itself and hardware”, because in many cases it’s just not true. Two’s complement hardware just wraps signed integers.

I’d much rather have a principled approach, say a library integer which can wrap / trap / saturate / UB. There’s a proposal for saturation, and I think the committee will take my paper and offer guidance on what a principled approach will be.
Post by JF Bastien
-fsanitize=unsigned-integer-overflow: Unsigned integer overflows. Note that unlike signed integer overflow, unsigned integer is not undefined behavior. However, while it has well-defined semantics, it is often unintentional, so UBSan offers to catch it.
This is very cool; I was unaware of this.
Your paper would benefit from mentioning this. But the obvious comeback is: where's the numbers on how many false positives UBSan generates in this mode? That number cannot possibly be zero.
Post by JF Bastien
- The impossibility of signed wraparound allows optimization of tight inner loops such as
for (int i = a; i != b; ++i)
Here the compiler is allowed to assume that `a <= b`, because if `b < a` the loop would eventually overflow and invoke undefined behavior.
I claim that this is also an emergent feature, not by design, caused by the 3 signed integer representations. I also claim that much of this performance can be regained with a better optimizer (the compiler I work on certainly optimizes loops substantially without assuming UB on overflow). Further, the Internet shows that this optimization isn’t something developers knowingly opt into, and when they hit it they are surprised by the bugs it generates.
Users never opt into bugs by definition. But users notice performance regressions (in the compiler) almost as quickly as they notice correctness regressions (in their own code).
File a bug if it’s slow. I sure do (or I go and fix the slowness).
Post by JF Bastien
Post by JF Bastien
This is intuitively the same behavior that we have with C++ iterators: the compiler is allowed to assume that the existence of a loop over the range `a` to `b` implies that `b` is actually reachable from `a` according to forward-increment semantics, even though in practice many implementations' std::list::iterator internally performs the equivalent of "wrap on overflow." (See the graphical diagram of "sentinel-node containers" in P0773 <http://www.open-std.org/jtc1/sc22/wg21/docs/papers/2017/p0773r0.html#B> if needed.)
John McFarlane did a lightning talk about integer UB and codegen within the past year but I don't know if the slides are somewhere. I can ask him.
The intuition is reversed, though: C came before C++ and iterators.
Again, historically accurate but this is not modern C++'s problem. We don't teach C before C++. (And if we did, we might soon have to explain that integer overflow is undefined in C but not in C++? Add "C source-level compatibility" to the list of rationales for preserving C's undefined overflow behavior in C++.)
C won’t be source-level incompatible because if you mix C and C++ then you’re using hardware which has two’s complement. I’m also in separate talks to bring this up with WG14.
Post by JF Bastien
Post by JF Bastien
I mean, it's not like there's any shortage of educational material on UB in C and C++ and its good and bad effects.
Sure, do you believe there are particular references that should be read with my proposal?
Off the top of my head, I recall John McFarlane's lightning talk on codegen, Michael Spencer's "My Little Optimizer: Undefined Behavior Is Magic" talk, and pretty much anything involving John Regehr. I found the link to Regehr2012 as one of the top Google hits for "unintentional unsigned overflow”.
All good references. I happen to have worked with Michael on the blog post that predates his talk, and helped review his talk. Michael and John both provided input to my paper before I published it (not to imply that they agree with all of the paper!). I’d like you to trust that I’ve done my homework before sending off the proposal. I purposefully kept the paper short-ish and only refer to standardese so that we can focus on what only allowing two’s complement means, and what I think we should do. I expect the committee to be well versed in C++, and I also expect very different opinions on overflow behavior. I just don’t know where exactly consensus will fall, we’ll see.
Post by JF Bastien
Post by JF Bastien
What there is a shortage of IMHO is material on ones'-complement in C and C++. That's why I kept large swaths of your paper intact in my fork. :)
Might the lack of such documentation be caused by a lack of ones’ complement hardware using modern C++?
Yes, that's what I intended to imply here. :)
Nobody teaches about the interaction of ones'-complement or sign-magnitude with code-generators anymore because these don't happen in practice.
People do teach about the interaction of undefined integer overflow with code-generators because this does happen in practice.
I’d rather teach that hardware wraps on signed overflow, just like unsigned, and unsurprisingly C++ does the same because it offers a pretty direct mapping to hardware.
Post by JF Bastien
Removing ones'-complement from C++ will be as painless (or painful) as removing trigraphs was.
Removing integer overflow from C++ will be as painful (or painless) as removing type-based alias analysis would be.
John McFarlane
2018-02-27 20:58:34 UTC
Permalink
Post by Arthur O'Dwyer
These are true, but then the current undefined behavior on signed overflow
Post by Arthur O'Dwyer
- Unintentional unsigned wraparound (for example, in the argument to
`malloc`) has been a known source of bugs for a long time. See for example
[Regehr2012] <https://www.cs.utah.edu/~regehr/papers/overflow12.pdf>,
whose final sentence is, "Our results also imply that tools for detecting
integer numerical errors need to *distinguish intentional from
unintentional uses of wraparound operations* — a challenging task — in
order to minimize false alarms. [emphasis added]" The current
undefinedness of signed overflow permits implementations, such as UBSan, to
detect all signed wraparound behavior as unintentional by definition, and
diagnose it accordingly.
unsigned wraparound isn’t UB, and I claim that signed overflow is UB
because of the 3 representation, not to catch bugs, otherwise unsigned
overflow would also have been UB. [...] I claim that this is also an
emergent feature, not by design, caused by the 3 signed integer
representations.
I think you are right in the historical sense. These days the emergent
"catch bugs" rationale has survived, though, even as the exotic hardwares
have died out.
Sure, I just think it’s unprincipled in where it’s applied because it’s
"where can I catch bugs? Where there used to be hardware variance!” How do
we expect newcomers to learn C++? We can’t just say “C++ offers a direct
map to hardware”, and “C++ leaves no room for a language between itself and
hardware”, because in many cases it’s just not true. Two’s complement
hardware just wraps signed integers.
IIUC, two's complement hardware does not *just* wrap signed integers. It
also sets the carry flag so that multi-word integer arithmetic can be
performed efficiently. In other words, it does not wrap integers because
they are in any sense "circular" but because that is how you perform long
addition, long multiplication, etc. Hence the view that it's unsigned
integers which break a hardware-level abstraction, and not signed integers
which impose a language-level abstraction.

I’d much rather have a principled approach, say a library integer which can
Post by Arthur O'Dwyer
wrap / trap / saturate / UB. There’s a proposal for saturation, and I think
the committee will take my paper and offer guidance on what a principled
approach will be.
+1
Myriachan
2018-03-01 03:32:24 UTC
Permalink
Post by JF Bastien
I’d much rather have a principled approach, say a library integer which
can wrap / trap / saturate / UB. There’s a proposal for saturation, and I
think the committee will take my paper and offer guidance on what a
principled approach will be.
On these lists before, I've called the modes "snap, trap and wrap". Just a
convenient English mnemonic for referring to this.

Melissa
o***@gmail.com
2018-02-26 10:49:33 UTC
Permalink
Post by JF Bastien
In the end I simply want two’s complement, and I see a few ways that this
could play out with everyone liking the outcome. I just ask that opposition
comes with rationales, not “I like my way better”. :-)
Do you just want two's complement or do you also want defined behavior for
overflows of int etc?
Based on the Intro that's not clear to me.

For overflow checks, wouldn't functions that'd return whether an overflow
happened while doing the calculation be simpler?
JF Bastien
2018-02-26 17:14:13 UTC
Permalink
Post by JF Bastien
In the end I simply want two’s complement, and I see a few ways that this could play out with everyone liking the outcome. I just ask that opposition comes with rationales, not “I like my way better”. :-)
Do you just want two's complement or do you also want defined behavior for overflows of int etc?
Based on the Intro that's not clear to me.
I think defined overflow behavior is an obvious fallout of two’s complement because that’s how hardware works and C++ describes itself as offering “a direct map to hardware". I think default overflow behavior should be wrap, and we should have library or operator support for UB / trap / saturation. Others disagree on what should be the terse syntax. I’m fine with another solution which allows developers to write code that overflows correctly without relying on non-obvious code patterns (such as casting to unsigned and back).
Post by JF Bastien
For overflow checks, wouldn't functions that'd return whether an overflow happened while doing the calculation be simpler?
We simply disagree on what the default behavior of + - * should be. One approach will have less syntax and the others more, I’m sure the committee will come to a decent consensus solution.
Thiago Macieira
2018-02-26 23:10:09 UTC
Permalink
Post by JF Bastien
Post by o***@gmail.com
For overflow checks, wouldn't functions that'd return whether an overflow
happened while doing the calculation be simpler?
We simply disagree on what the default behavior of + - * should be. One
approach will have less syntax and the others more, I’m sure the committee
will come to a decent consensus solution.
I'd argue that if you want to detect the overflow, a function like GCC/Clang's
__builtin_add_overflow() is better. It can calculate the result and tell you
whether it overflowed or underflowed. It's also harder to get wrong, avoiding
off-by-one errors (or off-by-2-billion errors).

That has no impact on how overflow without detection should be done. I have my
opinion on that one, though.
--
Thiago Macieira - thiago (AT) macieira.info - thiago (AT) kde.org
Software Architect - Intel Open Source Technology Center
Zhihao Yuan
2018-02-26 23:26:05 UTC
Permalink
Post by Thiago Macieira
I'd argue that if you want to detect the overflow, a function like GCC/Clang
__builtin_add_overflow() is better. It should be able to calculate the result
and tell you whether that overflowed or underflowed. It's also harder to get
wrong or off-by-one errors (or off by 2 billion).
See also http://www.open-std.org/jtc1/sc22/wg21/docs/papers/2017/p0103r1.html#numbers.overarith

--
Zhihao Yuan, ID lichray
The best way to predict the future is to invent it.
Thiago Macieira
2018-02-27 00:13:38 UTC
Permalink
Post by Zhihao Yuan
See also
http://www.open-std.org/jtc1/sc22/wg21/docs/papers/2017/p0103r1.html#number
s.overarith
Instead of "If there is no overflow, writes the computed value to *result.", I
suggest that the overflown and truncated value be stored in *result if it did
overflow.

By doing that, these functions could be used for overflow arithmetic when you
don't care or don't need to know if it did overflow.
--
Thiago Macieira - thiago (AT) macieira.info - thiago (AT) kde.org
Software Architect - Intel Open Source Technology Center
Valentin Nechayev
2018-10-08 06:07:09 UTC
Permalink
Post by Arthur O'Dwyer
P0907R0 "Signed Integers are Two's Complement"
<http://www.open-std.org/jtc1/sc22/wg21/docs/papers/2018/p0907r0.html>
My understanding is that JF wants to use this "modest proposal" (for
wrapping arithmetic on `int`, among other things) as a way to incite
discussion among the Committee.
It currently looks obvious that neither the proposition to reduce C++'s
scope to two's-complement machines nor the one to fix each integer
arithmetic operation to wrapping mode will succeed. The former seems to
have no strongly grounded aim, and the latter drastically reduces
optimization opportunities.

On the other hand, we have a good example of local initiative: the GCC
overflow builtins. Their set is incomplete (at the least, narrowing
conversions should be added, and shifts are desirable), but it points in
the right direction.

Amongst the current proposals, P0103r1 looks the closest to one that can
succeed, if reduced to a minimal required extension set. This includes
overflow_{add,sub,mul,cvt}. But the principal thing lacking in P0103r1 is
that, when overflow is detected, these functions should still update the
result value with the lowest bits of the result, the widest that fit into
the result type. The advantage of such an approach is shown in GCC: one
gets the truncated (wrapped) value for the only representation really
present in practice (two's complement), and a useful approximation of the
real value for other representations. So adding such a requirement is
really desirable. On the compilers' side, this would require minimal
effort but provide a basis for any future improvements and, right now,
a cumbersome but workable technique to do the needed things.

Another thing that is needed is a compile-time way to detect the signed
number representation used on a particular platform. Even though we have
two's complement in 100% of cases, this is needed to conform to the
standard's flexibility. It should be provided via preprocessor macros,
which would be useful for C code as well.

My proposal concept is here:
<https://segfault.kiev.ua/~netch/proposal_cpp_int_arith.html>. It carries
wording for level 1; wording for the higher levels is planned later.
Its "level 1" is really a subset of P0103r1 (but planned independently,
modeled on the GCC overflow builtins, and with the requirement to store
the value). The higher levels can then be discussed later. Level 2 is
easy to implement in a library header, provided level 1 is available.
Post by Arthur O'Dwyer
I have created an as-yet-unofficial "conservative fork" of the proposal,
which removes the parts that I think are airballs, while leaving in much of
what I consider the good stuff — notably, making signed-to-unsigned and
unsigned-to-signed conversions well-defined in terms of two's complement
representations, and defining what happens when you bit-shift into or out
of the sign bit.
https://quuxplusone.github.io/draft/twosc-conservative.html
I hope that if the Committee asks JF to come back with a more conservative
proposal, the existence of my "conservative fork" will save time, possibly
even allow further discussion later in the week at JAX.
I personally will not be at JAX, though. JF, will you be? Could I count
on you to... not to "champion" my unsubmitted paper, of course, but just to
be aware of it in case something like it is asked for by the Committee? I
mean, the worst-case, which I would like to avoid, is that JF's paper is
rejected as too crazy and then the entire subject is tabled until
Rapperswil. I would like to see some concrete progress in this department
at JAX if humanly possible.
–Arthur
P.S. — Also, if anyone on std-proposals has objections to the specific
diffs in my conservative proposal, I would like to know about it. I
deliberately tried to remove any whiff of controversy from the diff. (This
is distinct from objecting to my presumptuousness or objecting to wasting
the Committee's time. ;))
JF Bastien
2018-10-08 16:56:31 UTC
Permalink
P0907R0 "Signed Integers are Two's Complement" <http://www.open-std.org/jtc1/sc22/wg21/docs/papers/2018/p0907r0.html>
My understanding is that JF wants to use this "modest proposal" (for wrapping arithmetic on `int`, among other things) as a way to incite discussion among the Committee.
It currently looks obvious that neither proposition to reduce C++ area to two's-complement machines
How so? I see quite the opposite: I expect Jens’ re-worded paper P1236 to make it to C++20 in San Diego.
nor one to fix each integer arithmetic operation to wrapping mode will succeed.
That hasn’t been in papers after r0.
The former looks simply having no strong-grounded aim and the latter drastically reduces optimization opportinuties.
On the other hand, we have good example of local initiative - I mean GCC overflow builtins. Their set is incomplete (at least, narrowing shall be added, and shifts are desired) but exposes the proper trend.
Amongst the current proposals, P0103r1 looks the closest to one can succeed, if reduced to minimal required extension set. This includes overflow_{add,sub,mul,cvt}. But the principal thing lacked in P0103r1 is that, if overflow detected, these functions shall still update result value with lowest bits of result, the widest that fits into result type. The advantage of such approach is shown in GCC: one will get the truncated (wrapped) value for the only really present case (two's complement), and useful guess of real value for other representations. So, adding such requirement is really desired. From compilers' side, this would require minimal efforts but provide basis for any future improvements and, currently, a cumbersome but accessible technique to do the needed things just now.
Another thing that is needed is a compile-time way to detect the signed-number representation used on a particular platform. Although we have two's complement in 100% of cases, this is needed to conform to the standard's flexibility. It should be provided via preprocessor macros, which would also be useful for C code.
My proposal concept is here: <https://segfault.kiev.ua/~netch/proposal_cpp_int_arith.html>. It carries proposed wording for level 1; wording for the higher levels is planned later. Its "level 1" is essentially a subset of P0103R1 (though developed independently, modeled on the GCC overflow builtins, and with the requirement to store the wrapped value). The higher levels could then be discussed later; level 2 is easy to implement purely in a library header once level 1 is available.