Discussion:
Resumable expressions p0114r0 vs async/await P0057R0
Germán Diago
2015-10-03 11:41:18 UTC
Hello everyone,

I do not mean to start a flame war here, but I am still wondering why the
coroutines from P0057R0 are still being considered.

For what it is worth, I find the paper from Christopher Kohlhoff very
clarifying, very well
reasoned, and providing alternatives for all the important use cases from
P0057R0 with
superior implementations.

I still share the same concerns as before for P0057R0, mainly:

- mandatory type erasure.
- as Christopher mentions, embedding a scheduler into the language is not a
nice thing.
- viral await is also something to be aware of.


On top of that, he shows alternatives for implementations:

1. Generators (reified and type-erased).
2. await.



You also have yield as an object, which I think can be of advantage in many
situations.

But my real question is, since I am not an expert:

1. Is there something that can be done in P0057R0 that simply cannot be
done by resumable expressions + reasonable library support?

For me, await/async + an embedded scheduler is something like getting married
to an implementation detail that a run-time must support. The mandatory
type erasure is not nice, compared to being able to generate what you would
write by hand, a function object, which is what resumable expressions do.

Am I missing anything here? As I say, my knowledge is quite limited in this
area.
--
---
You received this message because you are subscribed to the Google Groups "ISO C++ Standard - Future Proposals" group.
To unsubscribe from this group and stop receiving emails from it, send an email to std-proposals+***@isocpp.org.
To post to this group, send email to std-***@isocpp.org.
Visit this group at http://groups.google.com/a/isocpp.org/group/std-proposals/.
Nicol Bolas
2015-10-03 13:57:18 UTC
Post by Germán Diago
Hello everyone,
I do not mean to start a flame here, but I am still wondering why the
coroutines from
P0057R0 are still being considered.
For what it is worth, I find the paper from Christopher Kohlhoff very
clarifying, very well
reasoned, and providing alternatives for all the important use cases from
P0057R0 with
superior implementations.
I am in no way deeply familiar with either of these two proposals. However,
after skimming P0114, one thing seems very clear: P0057 is *much* farther
along, in terms of actually creating an implementable standard.

The P0057 paper itself is actual wording, ready to be incorporated into the
standard. Not only that, P0057 has actual, *live* implementation experience
behind it. You can go get VS2015 right now and play with their
implementation of a version of this functionality
<https://paoloseverini.wordpress.com/2015/03/06/stackless-coroutines-with-vs2015/>
.

P0114 seems more... experimental. It sounds like something that has been
discussed to some degree, but is as of yet lacking a proof-of-concept
implementation. A lot is said about how it would be "possible" to implement
some particular facet under their new rules. But the paper never claims
that they've taken Clang or GCC or whatever and actually implemented it.

That's not to say that P0114 is dead and all effort should be focused on
P0057. But however much you may find P0114 to be technically superior,
P0057 has *earned* the right to be considered.

g***@hubblehome.com
2015-10-04 03:12:07 UTC
Post by Nicol Bolas
P0114 seems more... experimental. It sounds like something that has been
discussed to some degree, but is as of yet lacking a proof-of-concept
implementation. A lot is said about how it would be "possible" to implement
some particular facet under their new rules. But the paper never claims
that they've taken Clang or GCC or whatever and actually implemented it.
There is an experimental implementation.
That's not to say that P0114 is dead and all effort should be focused on
P0057. But however much you may find P0114 to be technically superior,
P0057 has *earned* the right to be considered.
Well, to me P0057 violates the *zero-overhead principle*, and in my humble
opinion the other proposal avoids that violation. With P0057 you do need boxing.
Besides that, it has other disadvantages, and I see it as a bit of a mistake,
again in my opinion, to embed a scheduler into the language when you
could do it in a library, as Christopher's paper shows.

But I am not on the committee, nor have I proposed anything. It just seems to me that
Christopher's proposal is more lightweight and can do everything
that P0057 can do, only better.

Gor Nishanov
2015-10-04 04:44:48 UTC
I see a bit of a mistake, again, in my opinion, to embed a scheduler into
the language, when you could do it in a library, as Christopher's paper
shows.
There is absolutely no embedded scheduler in P0057 and never was. P0057 and
its predecessors provide syntactic sugar for common async and sync patterns
and it is up to the library to decide what meaning to imbue the coroutine
with.

I suggest looking at this presentation:

http://open-std.org/JTC1/SC22/WG21/docs/papers/2014/n4287.pdf

which walks through some aspects of the P0057 proposal. Note that the
await syntax is actually quite old. It first appeared as do-notation in
Haskell in 1998, and you may notice that P0057 can be used to perform more
general "monadic" transformations and is not limited to coroutines.

Another thing that the presentation above highlights is that the
abstraction proposed is unique in that it is not just zero-overhead. It is
negative overhead :-) . Meaning that for some problems, taking
well-written code that uses functions / callbacks and rewriting it using
higher-level abstractions, namely the coroutines as proposed by P0057,
will result in a simpler implementation, smaller object size and faster
execution.

g***@hubblehome.com
2015-10-04 11:14:05 UTC
I see a bit of a mistake, again, in my opinion, to embed a scheduler into
the language, when you could do it in a library, as Christopher's paper
shows.
There is absolutely no embedded scheduler in P0057 and never was.
Hello Gor. If there is no scheduler, I do not understand how await can
work. Forgive my ignorance; as I said above, I do not know the details. But
my understanding is that if you have a call to await, that state for the
suspended coroutine must be kept somewhere. Where? I understand that this
state must live somewhere. Where is that state held?
P0057 and its predecessors provide syntactic sugar for common async and
sync patterns and it is up to the library to decide what meaning to imbue
the coroutine with.
http://open-std.org/JTC1/SC22/WG21/docs/papers/2014/n4287.pdf
Another thing that the presentation above highlights is that the
abstraction proposed is unique as it is not just zero-overhead. It is
negative overhead :-) . Meaning that for some problems, taking the
well-written code that uses functions / callbacks and rewriting it using
higher level abstractions, namely, the coroutines as proposed by P0057
will result in simpler implementation, smaller object size and faster
execution.
I do not yet get how it can achieve this negative overhead. The
coroutines are even type erased, as mentioned in Chris's paper. What can be
better than having inlinable, reified coroutines? I just do not get it.

I have three questions here:

1. How is the negative overhead achieved?
2. Would this have negative overhead *compared* to an implementation with
resumable expressions?
3. Do these optimizations are fancy? We have had good inliners for years,
but it seems the coroutines from P0057 mandate
type erasure.

Sorry if I make any mistakes during my explanation, I am not an expert on
these papers, I just happen to understand quite well
Christopher's metaphor of function objects, and I find it very hard to imagine
something more performant than non-type-erased coroutines
that take only the space strictly required.


Regards

g***@hubblehome.com
2015-10-04 11:16:19 UTC
Oh, my English! I write too fast:

3. Do these optimizations are fancy? ----> Are these optimizations fancy?

Nicol Bolas
2015-10-05 14:42:04 UTC
Post by g***@hubblehome.com
I see a bit of a mistake, again, in my opinion, to embed a scheduler into
the language, when you could do it in a library, as Christopher's paper
shows.
There is absolutely no embedded scheduler in P0057 and never was.
Hello Gor. If there is no scheduler, I do not understand how await can
work. Forgive my ignorance; as I said above, I do not know the details. But
my understanding is that if you have a call to await, that state for the
suspended coroutine must be kept somewhere. Where? I understand that this
state must live somewhere. Where is that state held?
After reading the proposal a bit, it's clear that `await` does not actually
"wait" on anything. For the most part, it's a syntactic transformation on
an expression.

The expression that `await` applies to must result in an object that has a
certain interface. And the logic for `await` calls that interface. If there
is any scheduling logic going on, it is in the implementation of that
interface, not in `await` itself.

As such, the storage in question is in the object resulting from the
`await` expression.

And the closest to scheduling that `await` gets is the decision to check if
the value is ready before yielding.
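
To make the "syntactic transformation" point concrete, here is a minimal sketch of what an `await` on such an object might look like if you wrote the expansion by hand. This is not the wording from P0057. The `pending_int` type, its `complete` member, the `manual_await` helper, and the use of `std::function` as a stand-in for a coroutine handle are all assumptions made up for illustration; only the three `await_*` member names are meant to mirror the awaitable interface.

```cpp
#include <cassert>
#include <functional>
#include <utility>

// Hypothetical awaitable: holds an int that may or may not be ready yet.
struct pending_int {
    bool ready = false;
    int value = 0;
    std::function<void()> continuation;  // stand-in for a coroutine handle

    bool await_ready() const { return ready; }
    void await_suspend(std::function<void()> resume) {
        continuation = std::move(resume);  // the state is stored in the object
    }
    int await_resume() const {
        assert(ready);  // only meaningful once a value has been produced
        return value;
    }

    // Called by whoever produces the value (an I/O callback, a test, ...).
    void complete(int v) {
        value = v;
        ready = true;
        if (continuation) {
            auto k = std::move(continuation);
            continuation = nullptr;
            k();  // resume whatever was waiting
        }
    }
};

// Roughly what `int r = await op; rest(r);` turns into, written by hand.
// `rest` plays the role of "the remainder of the coroutine body".
template <class Awaitable, class Rest>
void manual_await(Awaitable& op, Rest rest) {
    if (op.await_ready()) {
        rest(op.await_resume());        // fast path: no suspension at all
    } else {
        op.await_suspend([&op, rest] {  // suspend: hand over the continuation
            rest(op.await_resume());
        });
    }
}
```

Nothing in `manual_await` schedules anything. Whether and when the continuation runs is decided entirely by `pending_int::complete`, that is, by the library type.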

Post by g***@hubblehome.com
Sorry if I make any mistakes during my explanation, I am not an expert on
these papers, I just happen to understand quite well
Christopher's metaphor of function objects, and I find it very hard to imagine
something more performant than non-type-erased coroutines
that take only the space strictly required.
How much more performant? Is it enough to be worth arguing about? After
all, most things you'll be using await for won't be cheap operations. Will
you actually *notice* any such performance loss?

That's not to say that I much *like* resumable functions as a proposal. I
can't say I look forward to doing a bunch of `await`ing and using
specialized return values just to be able to allow some deeply nested
function to perform a `yield` back to the original caller. And that's not even
using threading.

That being said, I noticed one thing about resumable expressions that makes
it a complete deal-breaker for me:

When a resumable function is used in a resumable expression, the definition
of the function must appear before the end of the translation unit.
Um, no. I understand that this requirement is not necessarily recursive.
That is, you don't need the definition of every function the resumable one
calls. However, if the resumable one you call *itself* makes a resumable
call, you will need those definitions. And if they make resumable calls,
you'll need *those* definitions. And so forth.

If it's a choice between forbidding inlining and *forcing* inlining, I'll
accept the overhead of forbidding inlining.

Germán Diago
2015-10-06 11:34:02 UTC
Post by Nicol Bolas
How much more performant? Is it enough to be worth arguing about? After
all, most things you'll be using await for won't be cheap operations. Will
you actually *notice* any such performance loss?
Well, this is the usual argument for using "productivity languages". As far as I
know, the definition of performance for C++ is that the only thing left to
choose between C++ and machine code is assembly.
So far, it has been good to me. Boxing is bad, bad, bad. I do not think it
is a good idea in a language abstraction. About the scheduling, not sure,
but I believe what you say for now :).
Post by Nicol Bolas
When a resumable function is used in a resumable expression, the
definition of the function must appear before the end of the translation
unit.
There is an example of a boxed generator with separate compilation in the
paper. Doesn't that contradict what you are claiming?

Nicol Bolas
2015-10-06 14:05:11 UTC
Post by Nicol Bolas
How much more performant? Is it enough to be worth arguing about? After
all, most things you'll be using await for won't be cheap operations. Will
you actually *notice* any such performance loss?
Well, this is the usual argument for using "productivity languages". As far as I
know, the definition of performance for C++ is that the only thing left to
choose between C++ and machine code is assembly.
Yeah, tell that to iostream.

While that might be a goal of C++, it's not an *overriding* goal. It
doesn't automatically pre-empt all other considerations.

So far, it has been good to me. Boxing is bad, bad, bad.
Why?

I do not think it is a good idea in a language abstraction. About the
scheduling, not sure, but I believe what you say for now :).
Post by Nicol Bolas
When a resumable function is used in a resumable expression, the
definition of the function must appear before the end of the translation
unit.
There is an example of a boxed generator with separate compilation in the
paper. Doesn't that contradict what you are claiming?
So let me get this straight.

A resumable function, under this design, is inline. However, by making my
function *not* resumable, I can get the effect of a non-inline resumable
function by making the function body actually a lambda (which the compiler
will deduce is a resumable function), and returning it in some object. And
if I have allocation needs, I have to explicitly specify them in the body
of every function that has those needs. And so forth.

Meaning that, in a rather common case, the user has to do a lot of work.
Isn't the whole point of a compiler to do that sort of gruntwork *for you*?
Why not make `resumable` do this "boxing" work for you, and have `inline
resumable` do what the current thing suggests? After all, the current
`resumable` implies `inline`, so we'd just be making it explicit.

So `inline resumable` means to do what it currently says. And `resumable`
means to automatically do boxing and so forth. Or if you prefer `resumable`
to mean the current case, introduce another keyword to have the compiler
generate the "boxing" code for you.

Users should not have to do this nonsense manually, especially considering
how common such code will be.

Germán Diago
2015-10-06 17:32:31 UTC
Post by Nicol Bolas
Why not make `resumable` do this "boxing" work for you, and have `inline
resumable` do what the current thing suggests?
Because you can provide library solutions for the boxing in the proposals,
without embedding mandatory boxing into the feature.
Just provide a generator<int> and you are done. Nothing prevents you from
providing these library types, which can box, but
the feature does not *force* boxing on you from the beginning.
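
As a rough illustration of "boxing as a library opt-in", here is a sketch in plain C++ that does not use the resumable expression syntax from P0114 at all. The `countdown` function object stands in for the reified state a compiler would generate, and the `generator<T>` wrapper is a hypothetical library type built on `std::function`, not the one from either paper. All names here are assumptions for illustration.

```cpp
#include <cassert>
#include <functional>
#include <utility>

// A hand-written "reified" generator: all state lives in the object itself,
// its concrete type is visible to the caller, and calls can be inlined.
struct countdown {
    int n;
    // Writes the next value to `out`; returns false once exhausted.
    bool operator()(int& out) {
        if (n <= 0) return false;
        out = n--;
        return true;
    }
};

// Hypothetical library type that opts in to boxing. std::function erases the
// concrete generator type (and may heap-allocate), which is what lets a
// generator cross a separate-compilation boundary.
template <class T>
class generator {
public:
    template <class G>
    explicit generator(G g) : next_(std::move(g)) {
        assert(static_cast<bool>(next_));  // must wrap a callable
    }
    bool operator()(T& out) { return next_(out); }

private:
    std::function<bool(T&)> next_;
};

// Works with either form; only the boxed one pays for the indirection.
template <class Gen>
int sum_all(Gen gen) {
    int total = 0;
    int v = 0;
    while (gen(v)) total += v;
    return total;
}
```

Calling `sum_all(countdown{3})` uses the concrete type directly, while `sum_all(generator<int>(countdown{3}))` goes through the type-erased wrapper; the caller chooses which cost to pay.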

I cannot see how resumable expressions are worse than await when they:

1. can still provide types for boxing on the lib side.
2. can emulate async from a library.
3. have no viral await when refactoring.
4. do *not* mandate boxing.
5. can be embedded as member variables. Maybe even relax restrictions for
copy/move as needed in later proposals.
6. do not need the fancy escape analysis that await, as of now, needs, and
which is a Microsoft-specific compiler optimization as of today.


Post by Nicol Bolas
Users should not have to do this nonsense manually, especially considering
how common such code will be.
Just providing a library type solves the full problem.

Gor Nishanov
2015-10-06 20:00:28 UTC
Post by Nicol Bolas
Why not make `resumable` do this "boxing" work for you, and have `inline
resumable` do what the current thing suggests?
Because you can provide library solutions for the boxing in the
proposals, without embedding mandatory boxing into the feature.
Germán:

*On Boilerplate:*

The starting point for my proposal was lambda*: a lambda that keeps all of
the objects with automatic storage duration in its body in the lambda
function object. I wanted an abstraction that is more fundamental than what
was offered by earlier await proposals (pre-N4134) and awaits in other
languages, but one on top of which I could efficiently build C#-like
await syntax. After running around with that idea for a month, I came to the
conclusion that, when applied to concrete problems, it requires more
boilerplate code and does not result in more efficient code. Moreover, in
those cases where you don't need to allocate your lambda* on the heap and
can put it on the stack, I could elide the heap allocation in the optimizer for
N4134. Hence, I tabled lambda* until better times.

Chris's proposal suffers from the same problem as lambda*. It requires you to
write more code as a user without providing tangible benefit. As I said
before, for any concrete problem, you will get the same or more efficient
code with my proposal than with Chris's. Thus, there is no justification
for the added complexity.

Look at the async state machine problem which I discussed in
http://wg21.link/N4287. Or look at http://wg21.link/P0055, which explains
how await SomeAsyncOp is expanded. Now compare what it takes to get
from await f() in my proposal, and from await(f()) or f(use_await) in *P0114R0*
<http://www.open-std.org/jtc1/sc22/wg21/docs/papers/2015/p0114r0.pdf>, to
the actual OS call. Abstraction overhead is lower in P0057.

*On Optimizations*

C++ is a language that offers the ability to create zero-overhead
abstractions (or negative overhead in the case of P0057). However, the
zero-overhead part comes from the optimizer. When the STL was first proposed in
1994, no compiler in the world could make it efficient. It took more than
ten years before compilers caught up. Optimization technology is
fundamental to C++ abstractions.

*Await or Not*

If you look at http://wg21.link/P0054, you will find a section "Exploring
design space" which sketches out how you can evolve P0057 to add the
"magic" so that you don't have to write awaits. However, I am not sure that
the absence of an explicit indication of suspend points is a good thing, but I
may be convinced otherwise in the future.

Cheers,
Gor

Germán Diago
2015-10-07 04:00:34 UTC
Post by Gor Nishanov
Post by Nicol Bolas
Why not make `resumable` do this "boxing" work for you, and have `inline
resumable` do what the current thing suggests?
Because you can provide library solutions for the boxing in the
proposals, without embedding mandatory boxing into the feature.
*On Boilerplate:*
Chris's proposal suffers from the same problem as lambda*. It requires you
to write more code as a user without providing tangible benefit.
Well, I am not sure what boilerplate you are talking about. Chris's
proposal is more "low-level". But you can build on top of it, in a library,
everything that can be done by await, can't you? Also, implementing
resumable expressions in a compiler cannot be hard,
and you can reuse existing technology from the optimizer: basically the
inliner.
In the C++ style of things, where we are starting to treat things as
"Regular" objects and so on, I find the function object metaphor very clear.
But I also see some benefits; please tell me how they compare to your
proposal, because I understand Chris's proposal better, so I could be wrong:

About the negative overhead: I saw your slides and it is impressive, I must
admit. But:

1. I do not understand how negative overhead is achieved.
2. Can that negative overhead *not* be achieved by Chris's proposal? Is it
something exclusive to your implementation?


In Chris's proposal:

1. You can also implement await. It is a matter of providing a library
solution.
2. You can embed a resumable expression as a member.
3. You do not need a yield keyword; yield is actually reified in an object
and simple to implement. You could also save this state somewhere, as an
object. Can this be done
with await?
4. What is the space overhead of await? As far as I know, resumable
expressions need just the strictly necessary space. To eliminate this
overhead, do we need optimizations? Can it be done?
5. Reified resumable expressions (not type erased): is this possible in
your proposal? I think it was mentioned before that the optimizer can
discover this and do inlining when needed, anyway?
It can also do escape analysis.
6. Type-erased resumable expressions. This is opt-in in Chris's proposal.
It is a must in your proposal, right? Do you rely on the optimizer for
eliminating this overhead?





Gor Nishanov
2015-10-07 17:31:48 UTC
Germán:

As you may have noticed, the theme of my answers to your questions was to
move you away from a feature-list style of comparison and toward looking at
how it is reduced to practice.

Alex Stepanov said many insightful things and one of them was: "I still
believe in abstraction, but now I know that one ends with abstraction, not
starts with it. I learned that one has to adapt abstractions to reality and
not the other way around." (
http://web.archive.org/web/20071120015600/http://www.research.att.com/~bs/hopl-almost-final.pdf page
18).

I suggested earlier taking some concrete example, such as tcp_reader.
Write it both ways and analyze it from three angles:

1) how much code a user has to write to use this abstraction to solve the
problem;
2) how much code a library/framework developer has to write to support this
abstraction;
3) what the abstraction penalty is, that is, how many instructions (after
optimization) need to be executed to get from the abstraction to the
hardware.

I believe approaching the comparison in this way will help you discover
the answers to the questions you seek.

*On hidden magic:*

The coroutine proposal is similar to range-based for. The compiler provides the
syntactic sugar. The magic is the idea of an iterable, which allows the compiler to
communicate with the library.
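
For readers less familiar with the range-based for analogy, the sketch below shows the sugar next to an approximate hand-written desugaring. The `int_range` type and both function names are hypothetical; the point is only that the compiler knows nothing about the type beyond the `begin()`/`end()` protocol the library provides.

```cpp
#include <cassert>

// Hypothetical iterable: counts from `first` up to, but not including, `last`.
struct int_range {
    int first;
    int last;

    struct iterator {
        int value;
        int operator*() const { return value; }
        iterator& operator++() { ++value; return *this; }
        bool operator!=(const iterator& other) const { return value != other.value; }
    };

    iterator begin() const {
        assert(first <= last);  // otherwise != would never end the loop
        return {first};
    }
    iterator end() const { return {last}; }
};

// What the user writes.
int sum_with_sugar(const int_range& r) {
    int total = 0;
    for (int x : r) total += x;
    return total;
}

// Approximately what the compiler turns it into.
int sum_desugared(const int_range& r) {
    int total = 0;
    auto&& range = r;
    for (auto it = range.begin(), e = range.end(); it != e; ++it) {
        int x = *it;
        total += x;
    }
    return total;
}
```

Both functions compute the same result; swapping in a different iterable changes the behaviour without any change to the language rule.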

Similarly, with the coroutine proposal, the magic that gets you
zero/negative overhead is in the idea of an awaitable. You do the magic
yourself. You can find samples of "negative-overhead" awaitables in the
slides http://wg21.link/N4287. Also, http://wg21.link/P0055 shows how
this magic can be extended via the CompletionToken technique to any template
library that models its API after the networking library.

The transformation that the compiler performs is specified in
http://wg21.link/P0057.

Cheers,
Gor

germandiago@gmail.com
2015-10-09 05:49:44 UTC
Post by Gor Nishanov
If you noticed the theme of my answers to your questions was to move you
away from feature list style comparison and take a look at how it is
reduced to practice.
Alex Stepanov said many insightful things and one of them was: "I still
believe in abstraction, but now I know that one ends with abstraction, not
starts with it. I learned that one has to adapt abstractions to reality and
not the other way around." (
http://web.archive.org/web/20071120015600/http://www.research.att.com/~bs/hopl-almost-final.pdf page
18).
Well, I agree that you accumulated a good deal of experience during the
implementation. No one can deny that. And the evidence shows that, for your
test cases,
the negative overhead seems impressive.
Post by Gor Nishanov
*On hidden magic:*
Coroutine proposal is similar to a range-base-for. A compiler does the
syntactic sugar. The magic is an idea of iterable that allows a compiler to
communicate with the library.
That is nice, because I really thought there was a scheduler embedded in
some way, but this stays on the lib side, right?
Post by Gor Nishanov
Similarly, with coroutine proposal, the magic that gets you
zero/negative-overhead is in an idea of awaitable. You do the magic
yourself. You can find samples of "negative-overhead" awaitable in the
slides http://wg21.link/N4287. Also http://wg21.link/P0055 shows how
this magic can be extended via CompletionToken technique to any template
library that models their API after the networking library.
The transformation that compiler does is specified in
http://wg21.link/P0057.
Given all this, I see your proposal as a nice candidate also, though you
already know my preference and why. Basically:

1. Resumable expressions do not need to be type erased, but can be.
2. Resumable expressions can be held as objects (even non-type-erased
ones? I am not sure).

What are the chances that we could capture the coroutines themselves in
variables, and make them copyable and movable?
I tend to see it as a standard idiom to have objects that can be
copied, moved, compared, etc.; that is the trend lately, I think.
I do not mean the rest is not good, but why should we prevent these
semantics in coroutines in the first place?


Also, I see the yield keyword. I am not sure how it works. That is on the
library side in Chris' proposal, represented as an object.
What are the differences?
g***@hubblehome.com
2015-10-09 05:51:23 UTC
Permalink
Post by ***@gmail.com
1. Resumable expressions do not need to be type erased, but can.
2. Resumable expression objects can be held as objects (even non-type-erased?
I am not sure).
Also:

3. Template code reuse also seems appealing to me.
Gor Nishanov
2015-10-09 14:19:40 UTC
Permalink
Post by ***@gmail.com
Well, I agree that you accumulated a good deal of experience during the
implementation. No one can deny that. And the evidence shows that, for your
test cases, the negative overhead seems impressive.
Perhaps I was too cryptic in my previous response. The point of Stepanov's
quote was that the usefulness of an abstraction comes from how it helps to
solve a real problem. You start with the problem, you end with an
abstraction (see N4287 slide 7 for more). Thus, if you believe that a
particular aspect of resumable expressions is awesome, take a real problem
(possibly reduced to the size of tcp_reader) and code it up using
resumable expression syntax, then compare how the same problem can be
solved using coroutines. Evaluate it on the three criteria I listed: how
much the end user writes, how much library support is required, and what the
abstraction penalty is.
Post by ***@gmail.com
Also, I see the yield keyword. I am not sure how it works. That is on the
library side in Chris' proposal, represented as an object.
`yield expr` is syntactic sugar for `await $p.yield_value(expr)`. See
p0057r0/*[expr.yield]*
<http://www.open-std.org/jtc1/sc22/wg21/docs/papers/2015/p0057r0.pdf>
Nicol Bolas
2015-10-09 15:21:56 UTC
Permalink
Post by ***@gmail.com
What are the chances that we could capture the coroutines themselves in
variables, and make them copyable and movable?
I tend to see as a standard idiom to have objects that can be
copied/moved/compared, etc. that is the trend lately I think.
I do not mean the rest is not good, but why should we prevent these
semantics in the first place in coroutines?
If you're talking about value semantics, that doesn't make sense for
coroutines. Remember that part of a coroutine's state is the stack. And
stack variables are often references or pointers to other stack variables.
You cannot effectively copy such a construct. And it's silly for the user
to have to define a "copy constructor" for their call stack.
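To make the copying problem concrete, here is a minimal sketch in plain C++. `Frame` is a hypothetical stand-in for a coroutine's saved locals, not anything from P0057 or P0114: it holds one local array and a second local that points into it. A member-wise copy duplicates the pointer value, so the copy's pointer still refers to the original frame.

```cpp
#include <cassert>
#include <cstddef>

// Hypothetical stand-in for a coroutine's saved "stack": one local array
// plus a second local that points at an element of that array.
struct Frame {
    int  buffer[4] = {10, 20, 30, 40};
    int* cursor    = buffer + 1;   // refers to another local in the same frame
};

// True when f.cursor points at one of f's own buffer elements.
inline bool cursor_is_internal(const Frame& f) {
    for (std::size_t i = 0; i < 4; ++i) {
        if (f.cursor == &f.buffer[i]) return true;
    }
    return false;
}
```

After `Frame copy = original;`, `cursor_is_internal(copy)` is false, because `copy.cursor` still points into `original.buffer`. Fixing that would take a user-written copy constructor that knows which members are internal pointers, which is the burden described above.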

That's why `coroutine_handle` has reference semantics. It just makes more
sense for coroutines. That doesn't prevent you from being able to pass them
(and any containing object) around. You can even have a
`std::vector<generator<int>>` and resume each one in turn.

Or to put it another way, just because you see `await` used to catch the
coroutine promise returned by a coroutine does not mean you *have* to use
it that way.
Post by ***@gmail.com
Also, I see the yield keyword. I am not sure how it works.
It returns a value and suspends the coroutine's execution at that point.

If you're wondering about the details of how the value is passed to the
coroutine promise and all, that's part of the proposal.
Germán Diago
2015-10-10 04:24:34 UTC
Permalink
Post by Nicol Bolas
Post by ***@gmail.com
What are the chances that we could capture the coroutines themselves in
variables, and make them copyable and movable?
I tend to see as a standard idiom to have objects that can be
copied/moved/compared, etc. that is the trend lately I think.
I do not mean the rest is not good, but why should we prevent these
semantics in the first place in coroutines?
If you're talking about value semantics, that doesn't make sense for
coroutines. Remember that part of a coroutine's state is the stack. And
stack variables are often references or pointers to other stack variables.
You cannot effectively copy such a construct. And it's silly for the user
to have to define a "copy constructor" for their call stack.
In some cases it makes sense; in others, it does not. It all
depends, I think. Let me look for more serious use cases.
Post by Nicol Bolas
That's why `coroutine_handle` has reference semantics. It just makes more
sense for coroutines. That doesn't prevent you from being able to pass them
(and any containing object) around. You can even have a
`std::vector<generator<int>>` and resume each one in turn.
That you can store them is nice, indeed!
Post by Nicol Bolas
Or to put it another way, just because you see `await` used to catch the
coroutine promise returned by a coroutine does not mean you *have* to use
it that way.
I am not sure about this. I have to take a more serious look at both
proposals and do a comparison. I think a good starting point would be to
convert Gor's code
to resumable expressions, which I think is the more low-level proposal,
and see how the code looks.

Yesterday I was taking a look and I still have the impression that Gor's
proposal is not as minimal as it could be. At least it does not embed any
scheduler, that is true,
but I see that it "hardcodes" a protocol into the language that is bigger
than Chris'. But as I said, I need to take a more serious look at this to
make a really
fair and accurate comparison.
Post by Nicol Bolas
Post by ***@gmail.com
Also, I see the yield keyword. I am not sure how it works.
It returns a value and suspends the coroutine's execution at that point.
In resumable expressions, that can be done on top of a library abstraction,
so why should putting this into the language be better?
Remember that when you put something into the language, there is no way
back. This is my main reason for putting things
into a library when possible. Resumable expressions are minimal. Still,
hold on, I need to take a more serious look at this.
Post by Nicol Bolas
If you're wondering about the details of how the value is passed to the
coroutine promise and all, that's part of the proposal.
Well, at first impression, you all know which proposal I would favour,
but I need to read up further on this. I hope I can
give a more in-depth comparison. I cannot promise, though; it is a lot of work. :)

Regards
Nicol Bolas
2015-10-10 06:34:40 UTC
Permalink
Post by Germán Diago
Post by Nicol Bolas
Or to put it another way, just because you see `await` used to catch the
coroutine promise returned by a coroutine does not mean you *have* to
use it that way.
I am not sure about this. I have to take a more serious look to both
proposals and do a comparison. I think a good starting point would be to
convert Gor's code
to resumable expressions, which is what I think is more low-level
proposal, and see how code looks.
Yesterday I was taking a look and I still have the impression that Gor's
proposal is not as minimal as it could be. At least it does not embed any
scheduler, that is true,
but I see that it "hardcodes" a protocol into the language that is bigger
than Chris'. But as I said, I need to take a more serious look at this to
make a really
fair and accurate comparison.
I guess my question is this: why does it matter which is "lower level" than
the other?

A proposal should be as low level as it needs to be, and *no lower*. So,
given equal performance in similar situations (and Gor has provided
evidence that this is possible in at least some cases), the principal
difference maker should be actual functionality, not the level of
abstraction.

Indeed, I feel quite the opposite from you. So long as they provide
equivalent functionality with equivalent performance, the *higher* level
one should be considered better. `int[5]` is unquestionably lower level
than `array<int, 5>`, but we tell people to almost always use the latter
instead of the former. We do so because, though the latter is higher level,
it causes no loss of performance, and it improves safety and ease-of-use
over the former.
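A small sketch of the `int[5]` versus `array<int, 5>` comparison, using only standard facilities. The helper names (`sum_raw`, `sum_array`, `checked_access_throws`) are made up for this example. It shows the two conveniences the higher-level type adds: the size travels with the type, and `at()` offers checked access.

```cpp
#include <array>
#include <cassert>
#include <cstddef>
#include <stdexcept>

// Raw array: the size has to be passed separately and can be wrong.
inline int sum_raw(const int* data, std::size_t n) {
    int total = 0;
    for (std::size_t i = 0; i < n; ++i) total += data[i];
    return total;
}

// std::array: the size is part of the type, so it cannot go out of sync.
template <std::size_t N>
int sum_array(const std::array<int, N>& values) {
    int total = 0;
    for (int v : values) total += v;
    return total;
}

// Returns true if at() rejected the index with std::out_of_range.
inline bool checked_access_throws(const std::array<int, 5>& values,
                                  std::size_t index) {
    try {
        (void)values.at(index);
        return false;
    } catch (const std::out_of_range&) {
        return true;
    }
}
```

Both sums compute the same result; only the `std::array` version can detect an index of 5 as out of range without extra bookkeeping by the caller.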

Also, I see the yield keyword. I am not sure how it works.
Post by Germán Diago
Post by Nicol Bolas
It returns a value and suspends the coroutine's execution at that point.
In resumable expressions, that can be done on top of a library
abstraction, why putting this into the language should be better?
The question could easily be turned around: why is putting it into the
library better? Because it's more "low-level", by some measurement? What
good does that do me as a user of the language?

After all, with the exception of `break`, there's nothing range-based for
can do that `for_each` cannot. Yet we thought that was important enough to
put into the language.
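The `break` exception can be seen in a few lines. This is only an illustrative sketch with made-up function names: both functions count how many elements they visit while looking for the first negative value. The range-based for loop stops at the match; `std::for_each` has no way to stop short of throwing an exception, so it visits everything.

```cpp
#include <algorithm>
#include <cassert>
#include <vector>

// Range-based for: `break` ends the loop at the first negative value.
inline int visited_with_range_for(const std::vector<int>& values) {
    int visited = 0;
    for (int v : values) {
        ++visited;
        if (v < 0) break;
    }
    return visited;
}

// std::for_each: returning from the lambda only ends that one call,
// so the algorithm still visits every element.
inline int visited_with_for_each(const std::vector<int>& values) {
    int visited = 0;
    std::for_each(values.begin(), values.end(), [&](int v) {
        ++visited;
        if (v < 0) return;   // cannot stop the iteration from here
    });
    return visited;
}
```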

So I do not see why merely allowing a feature to be implemented in the
library rather than the language is a point in that version's favor. It's
interesting and useful to note, but it is not, by itself, an advantage.

Remember that when you put something into the language, there is no way
Post by Germán Diago
back.
I contest the idea that language features are more immutable than library
features. The standard library contains a lot of deficient elements, but I
don't see those being undone. We still have iostreams lying around, despite
wide-spread conventional wisdom saying not to use them. We still have STL
containers that don't erase the allocator's type, despite this being an
oft-requested feature.

Oh sure, we seem to be getting rid of `auto_ptr` and a couple of other
small things. But I don't see any evidence that library functionality is so
much more malleable than language features.

Mistakes will persist, no matter whether they are language or library
mistakes. So we shouldn't be that much more afraid of language screwups
than library ones. Indeed, the latter will be far more persistent, since
most people aren't going to be directly interfacing with the low-level.

Or to put it another way, however you choose to implement "yield", that is
what people are going to use. As a common coroutine feature, lots of code
will be written against it. And once implemented, you will be no more able
to correct flaws in a library `yield` function than in a language `yield`
expression.
Germán Diago
2015-10-07 04:17:08 UTC
Permalink
Post by Gor Nishanov
*Await or Not*
If you look at http://wg21.link/P0054, you will find a section "Exploring
design space" which sketches out how you can evolve P0057 to add the
"magic" so that you don't have to write awaits. However, I am not sure that
absence of explicit indication of suspend points is a good thing, but, may
get convinced otherwise in the future.
Would this make other code as reusable as in Chris's proposal? That would be
a good thing. You could still be explicit where you want to, without having to
infect everything up the stack with await.
I favor this, sure.
Nicol Bolas
2015-10-07 01:48:06 UTC
Permalink
Post by Nicol Bolas
Why not make `resumable` do this "boxing" work for you, and have `inline
Post by Nicol Bolas
resumable` do what the current thing suggests?
Because you can provide library solutions for the boxing in the
proposals, without embedding mandatory boxing into the feature.
Just provide a generator<int> and you are done. Nothing prevents you from
giving these library types, that can box, but
it does not *force* you from the beginning.
Perhaps you misunderstood what I meant when I was talking about "boxing".

I'm not talking about `generator<T>`. I'm talking about the lambda wrapper
part, with optional allocator and whatnot. Having to write `return
[=](){<actual function>};` around every resumable function I write is a
pain that I would rather not have to deal with.

There is absolutely no reason why that boilerplate can't be written by the
compiler.

I cannot see how resumable expressions are worse than await when:
"Worse" is ultimately a matter of opinion. Some restrictions will be
considered worse by some people than others.

However, I have to say that Gor Nishanov seems to be winning the argument
here, since P0057 is being implemented *efficiently*, with all the inlining
and other simplifications that you claimed were not possible. That was your
biggest argument against resumable functions, and it turned out to be wrong
in at least some of the cases. You can try to dismiss the fact that a good
optimizer made the code equivalent if you like, but that doesn't change the
fact that optimizers aren't getting worse over time. They're getting *better*.

If there is no objective performance difference between them, then most of
your case just evaporated. So the only remaining functional difference is
that P0114 requires explicit "boxing" if your code needs boxing.

If you're going to bring up having to use `await` frequently:

3. no viral await when refactoring.


You say that as though resumable expressions don't have their viral aspects
too. You can only call a resumable function as part of a resumable context:
either the calling function is resumable or the expression making that call
is resumable. Both of these require explicit annotation (except in those
cases where the compiler magically works it out for you... somehow).

Oh sure, you won't be using `resumable` like you would `await`, or quite as
much. But the fact is, you can't call a `resumable` function unless you've
typed `resumable` somewhere nearby. So you still have to annotate up your
call graph. And you still can't call coroutines of either kind without some
kind of annotation.

So I'm not seeing how that's a point in resumable expression's favor.

Oh, and I have to agree with P0054: I think I'd rather see awaiting and
yielding happen than for them to be implicit. Also, I seem to recall that
an explicit `await` was something that the `expected` guys wanted to be
able to key off of. So there is the power to be able to use the
functionality for other uses.
Germán Diago
2015-10-07 04:09:28 UTC
Permalink
3. no viral await when refactoring.
Post by Nicol Bolas
You say that as though resumable expressions don't have their viral
aspects too. You can only call a resumable function as part of a resumable
context: either the calling function is resumable or the expression making
that call is resumable. Both of these require explicit annotation (except
in those cases where the compiler magically works it out for you...
somehow).
Oh sure, you won't be using `resumable` like you would `await`, or quite
as much. But the fact is, you can't call a `resumable` function unless
you've typed `resumable` somewhere nearby. So you still have to annotate up
your call graph. And you still can't call coroutines of either kind without
some kind of annotation.
I think that comparison is simply not honest: if you put await 7 levels down
the stack, you need to decorate all the way up with await. For resumable,
you would need to do it in the 7th level only, and you
would not need to refactor the rest of the code. That is 7 vs 1
refactoring. Not to mention the reusability problem that Chris lays out in
the paper: you cannot reuse algorithms, for example,
with await. You cannot have member variables with await either, right?
These are all tangible benefits from having a function object as a
representation.
Post by Nicol Bolas
So I'm not seeing how that's a point in resumable expression's favor.
You can see it: Imagine a deep stack of calls. How much refactoring do you
need in each of the proposals? Imagine code reuse: resumable expressions
can reuse code.
We cannot say the same about await *unless* I missed something. In the C++
style, I think resumable expressions are better behaved than await, in the
sense that
the result is just a function object: you know what it is doing, you could
make it (maybe in future proposals) copyable and movable, and you know the
representation: jump point + strictly needed data.
I think the resumable expressions proposal sets the bar very high for the
rest of the proposals, because besides its benefits, you can also implement
what other proposals are proposing.
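For reference, this is roughly what "a function object with a jump point plus strictly needed data" looks like when written by hand. It is a sketch in plain C++ with a made-up class name; it does not use P0114 syntax and makes no claim about what a compiler would actually generate. Because the state is just an enum and two ints with no internal pointers, the default copy is a valid independent generator.

```cpp
#include <cassert>

// Hand-written generator yielding first, first + 1, ..., last.
class CountingGenerator {
public:
    CountingGenerator(int first, int last) : current_(first), last_(last) {}

    // Each call resumes from the saved jump point.
    // Returns false once the sequence is finished.
    bool operator()(int& out) {
        if (state_ == State::AfterYield) {
            ++current_;                 // finish the step we suspended in
        }
        if (state_ == State::Done || current_ > last_) {
            state_ = State::Done;
            return false;
        }
        out = current_;
        state_ = State::AfterYield;     // remember where to resume
        return true;                    // plays the role of "yield"
    }

private:
    enum class State { Start, AfterYield, Done };
    State state_ = State::Start;        // the jump point
    int current_;                       // strictly needed data
    int last_;
};
```

Copying a `CountingGenerator` after it has produced its first value gives two objects that each continue from 2 independently. That works here only because nothing in the state refers back into the object itself.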
Post by Nicol Bolas
Oh, and I have to agree with P0054: I think I'd rather see awaiting and
yielding happen than for them to be implicit.
For me, implicit yielding is precisely what a non-object-based yield does.
In Chris's proposal, a yielder is an object you can pass around.
On top of that you could build more abstractions. He got it right. He did
not hide anything in a way that prevents any functionality.

I see Gor's proposal powerful also but I think it is hiding too much stuff,
but the point is not that actually, the point is that you can do
what that proposal does on top of resumable expressions, unless I am
misunderstanding something.
Nicol Bolas
2015-10-07 14:39:06 UTC
Permalink
Post by Germán Diago
3. no viral await when refactoring.
Post by Nicol Bolas
You say that as though resumable expressions don't have their viral
aspects too. You can only call a resumable function as part of a resumable
context: either the calling function is resumable or the expression making
that call is resumable. Both of these require explicit annotation (except
in those cases where the compiler magically works it out for you...
somehow).
Oh sure, you won't be using `resumable` like you would `await`, or quite
as much. But the fact is, you can't call a `resumable` function unless
you've typed `resumable` somewhere nearby. So you still have to annotate up
your call graph. And you still can't call coroutines of either kind without
some kind of annotation.
I think that comparison is simply not honest: if you put await 7 levels
down the stack, you need to decorate all the way up with await.
The way `await` works is that it halts the current function and returns
control to the calling function in such a way that the calling function can
resume it later.

Therefore, the only reason you would need to "put await 7 levels down the
stack" is if you want every function in that call graph to halt when the
top-most function does, and thus return control to the caller "7 levels
down". Correct?
Post by Germán Diago
For resumable, you would need to do it in the 7th level only, and you
would not need to refactor the rest of the code.
That's not true.

A function marked `resumable` can *only be called* from a resumable
context. This is either another function marked `resumable` or from an
expression marked `resumable`.

So this is illegal:

resumable int level3()
{
    return 5;
}

int level2()
{
    return level3(); //Cannot call a resumable function here.
}

Therefore, *every one of those 7 levels* is going to have to be a
`resumable` function. So every one of those levels will have to mark their
signature with `resumable`. Any code that calls into any one of those 7
levels will have to mark each use of them with `resumable`, or will
themselves have to be coroutines.

So yes, it's just as viral. Only it's worse, because not only do you have
to mark them `resumable`, they *must be inline*.

The only saving grace you get is that the proposal allows automatic
deduction of resumable functions. But not everywhere; only in template
code and lambdas. So normal functions don't provide this feature.

That is 7 vs 1 refactoring. Needless to say the reusability problem that
Post by Germán Diago
Chris exposes in the paper: you cannot reuse algorithms, for example,
with await.
That helps demonstrate the viral nature of resumable expressions. If I call
an algorithm that internally does an implicitly resumable operation, then
that algorithm internally becomes a coroutine. It becomes resumable.

Which means... I now *must* call that algorithm instantiation from a
resumable context. So either my function itself is `resumable`, or I have
to say `resumable for_each(...)`.

This doesn't invalidate your point, namely that algorithms will deduce how
to properly be `resumable` for their contents. The exact suspend/resume
points will not be defined by the writer of the algorithm, but by the
functions the algorithm actually calls. And there is value to that.

At this point, that is basically the only advantage of resumable
expressions. It's a non-trivial thing to be sure, but I don't see anything
about resumable functions that would prevent you from addressing these
concerns there.

What resumable functions lack relative to resumable expressions are two
things:

1) A way for a function to effectively force the caller to become a
coroutine (implicit await).

2) A way for the caller of a function to *reverse* the implicit `await` of
a function call (that's what `resumable` applied to expressions does).

These features are all it takes to allow for the kind of template code
reuse you're talking about. Though it does make it slightly more
inconvenient than the resumable expressions model, since RE coroutines
don't have return type requirements.

But neither of these is *impossible* with the resumable functions model.
It's simply a matter of finding the best way to add those features in.

And of deciding if we want them at all (that's not necessarily a given).

You cannot have member variables with await either, right?
I don't know what you mean by this.
Post by Germán Diago
These are all tangible benefits from having a function object as a
representation.
... huh? Those benefits have nothing to do with "having a function object
as a representation." Those benefits come from having to declare whether a
function is a coroutine at the function level, rather than in the
function's implementation.

So I'm not seeing how that's a point in resumable expression's favor.
Post by Germán Diago
You can see it: Imagine a deep stack of calls. How much refactoring do you
need in each of the proposals? Imagine code reuse: resumable expressions
can reuse code.
Code reuse in template functions, perhaps. Code reuse elsewhere? Not so
much.
Post by Germán Diago
We cannot say the same about await *unless* I missed something. In the C++
style, I think resumable functions are more well behaved than await, in the
sense that
it is just a function object, you know what it is doing, you could make it
(maybe in future proposals) copyable, movable, you know the representation:
jump point + strictly needed data.
I think the resumable expressions proposal puts the bar very high to the
rest of the proposals, because besides its benefits, you can also implement
what other proposals are proposing.
Yes, you could implement resumable functions on top of resumable
expressions. But that doesn't prove resumable expressions are better. Nor
does it prove that anything you could build atop resumable expressions
cannot also be built atop resumable functions.

The thing is that there is one really important fact that cannot be gotten
around.

Resumable functions are a pretty-well proven concept. We have multiple
implementations of them, apparently. We have experience in implementing
them, with an evaluation of optimization opportunities. We have actual
standard wording for it.

Can you tell me that resumable expressions have anywhere *near* the field
experience?
Nicol Bolas
2015-10-07 15:21:58 UTC
Permalink
Actually, I misunderstood two things about resumable expressions, both of
which lead me to believe that P0114 is *broken*. Though not irreparably.

*1) Implicit resumable deduction.*

Apparently, implicit resumable deduction happens more often than I thought.
It happens on `inline` functions, member functions in class definitions,
and so forth. Indeed, pretty much the only place where resumable deduction
is not implicit... is in normal functions in .cpp files.

Which leads to breakage #1:

//Some header file.
inline int a_func()
{
    return 5;
}

inline int b_func()
{
    break resumable;
    return 5;
}

//Some cpp file.

static void func()
{
    auto x = a_func() + b_func();
    internal_global_var += x;
}

void caller()
{
    func();
}

OK, where will the compile error point? Well, the compile error will cite
the first line of `func`. Which seems OK. You're calling a resumable
function from a non-resumable context, and the function isn't inline, so it
won't be implicitly deduced. That sounds legitimate.

But wait: what happens if you stick `inline` in front of `func`? It's a
static function, so making it `inline` seems harmless (though silly). Yet
suddenly... the error moves. Now it points at `caller`.

Why? The last time anyone mentioned `resumable` was 2 function levels below
`caller`, and in a completely different file. All because someone thought
they'd help the compiler out with inlining by (mis)using the `inline`
keyword.

The distance between where the last `resumable` was and where the improper
use of a resumable function call is (ie: where the error actually happened)
should be exactly one. That is, one call. The compiler should be able to
point at the call, and the user should be able to see, *from the call
itself*, that this function can't be called in this way.

I should not have to look at the implementation of your code, and the code
you call, ad infinitum, before I'm able to decide whether and/or how I can
call your function.

If the question is this: should it be possible to *accidentally* write a
coroutine by calling a coroutine? My answer to that is "only if the
caller can easily see that they've done so." And the only way to do that is
to put something in the function *signature* that says "hey, I'm a
coroutine; if you call me, so are you."

For resumable expressions, that's spelled `resumable`. And therefore, every
resumable function ought to be *explicitly* tagged as such.

Of course, taking away implicit `resumable` deduction breaks the marquee
feature of resumable expressions: the ability for templates to become
coroutines based on what template parameters they're given.

I don't care. The downsides of this approach are not worth the advantages.

`resumable` may not be "viral". But it is just as infectious as await. And
a silent infection is far more pernicious than a noisy one.

*2) Implicit `resumable` deduction oversight.*

I must assume that this is an oversight. Otherwise, the proposal makes no
sense.

The proposal states that a function is implicitly resumable if it calls
`break resumable` or calls any resumable function. *Period*.

What about calling a resumable function within a resumable expression? IE:
`resumable auto x = resumable_func()`. No exception is listed for this;
calling `resumable_func` makes the calling function resumable, period. And
if it's not in an implicitly resumable deduction context, it is a compile
error.

I have to assume that this was simply a mistake. That the section on
implicit deduction was meant to have allowances for resumable expressions,
that any functions called in such an expression effectively don't count.
Otherwise, *main itself* will have to be resumable if you call any
resumable function.
g***@hubblehome.com
2015-10-07 15:37:45 UTC
Permalink
Post by Nicol Bolas
Therefore, the only reason you would need to "put await 7 levels down the
stack" is if you want every function in that call graph to halt when the
top-most function does, and thus return control to the caller "7 levels
down". Correct?
Post by Germán Diago
For resumable, you would need to do it in the 7th level only, and you
would not need to refactor the rest of the code.
That's not true.
As I understood it, maybe I am wrong, the function will suspend inside.
Post by Nicol Bolas
A function marked `resumable` can *only be called* from a resumable
context. This is either another function marked `resumable` or from an
expression marked `resumable`.
resumable int level3()
{
    return 5;
}
int level2()
{
    return level3(); //Cannot call a resumable function here.
}
Therefore, *every one of those 7 levels* is going to have to be a
`resumable` function. So every one of those levels will have to mark their
signature with `resumable`. Any code that calls into any one of those 7
levels will have to mark each use of them with `resumable`, or will
themselves have to be coroutines.
So yes, it's just as viral. Only it's worse, because not only do you have
to mark them `resumable`, they *must be inline*.
I am not sure about this limitation. I will take a look again.
The only saving grace you get is that the proposal allows automatic
deduction of resumable functions. But not everywhere; only in template
code and lambdas. So normal functions don't provide this feature.
Now I understand why it works. I did not catch this at first.
Post by Nicol Bolas
That is 7 vs 1 refactoring. Needless to say the reusability problem that
Post by Germán Diago
Chris exposes in the paper: you cannot reuse algorithms, for example,
with await.
That helps demonstrate the viral nature of resumable expressions. If I
call an algorithm that internally does an implicitly resumable operation,
then that algorithm internally becomes a coroutine. It becomes resumable.
Yes. True. Though, still more reusable than await in this regard.
Post by Nicol Bolas
Which means... I now *must* call that algorithm instantiation from a
resumable context. So either my function itself is `resumable`, or I have
to say `resumable for_each(...)`.
This doesn't invalidate your point, namely that algorithms will deduce how
to properly be `resumable` for their contents. The exact suspend/resume
points will not be defined by the writer of the algorithm, but by the
functions the algorithm actually calls. And there is value to that.
Agree.
Post by Nicol Bolas
At this point, that is basically the only advantage of resumable
expressions. It's a non-trivial thing to be sure, but I don't see anything
about resumable functions that would prevent you from addressing these
concerns there.
What resumable functions lack relative to resumable expressions are two things:
1) A way for a function to effectively force the caller to become a
coroutine (implicit await).
2) A way for the caller of a function to *reverse* the implicit `await`
of a function call (that's what `resumable` applied to expressions does).
These features are all it takes to allow for the kind of template code
reuse you're talking about. Though it does make it slightly more
inconvenient than the resumable expressions model, since RE coroutines
don't have return type requirements.
But neither of these is *impossible* with the resumable functions model.
It's simply a matter of finding the best way to add those features in.
Would be nice to have those.
Post by Nicol Bolas
And of deciding if we want them at all (that's not necessarily a given).
You cannot have member variables with await either, right?
I mean a member variable that holds a resumable expression, you can have
that, it is a function object.
Can this be done with resumable functions? As far as I understand, but
again, you seem
to understand the proposal better than me, you can only store the result,
but not the object itself.
Post by Nicol Bolas
I don't know what you mean by this.
Post by Germán Diago
These are all tangible benefits from having a function object as a
representation.
If you have a function object, we understand how to save it, reified
(non-type erased) and how to extend that to
a copyable, movable object when it makes sense. That is what I meant.
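
The point about a function object representation can be sketched in plain C++ today. This is only an illustration, not code from either proposal: `Countdown`, `Owner` and `erase` are hypothetical names, and the "coroutine" is written by hand. It shows the state stored reified as an ordinary member (copyable, no allocation), with type erasure as an opt-in step through `std::function`.

```cpp
#include <cassert>
#include <functional>

// Hypothetical hand-written "resumable" function object: each call resumes
// the computation and yields the next value of a countdown.
class Countdown {
public:
    explicit Countdown(int start) : next_(start) { assert(start >= 0); }

    bool done() const { return next_ < 0; }

    // Resume: produce the next value and advance the stored state.
    int operator()() {
        assert(!done());
        return next_--;
    }

private:
    int next_;  // the entire "coroutine frame" lives inside the object
};

// Reified storage: the concrete type is a plain member, so there is no
// allocation, and Owner is copyable because Countdown is.
struct Owner {
    Countdown task{2};
};

// Type-erased storage: opt in only where a uniform type is needed.
inline std::function<int()> erase(Countdown c) { return c; }
```

Copying an `Owner` snapshots the suspended state, so the original and the copy then resume independently.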
Post by Nicol Bolas
... huh? Those benefits have nothing to do with "having a function object
as a representation." Those benefits come from having to declare whether a
function is a coroutine at the function level, rather than in the
function's implementation.
Code reuse in template functions, perhaps. Code reuse elsewhere? Not so
much.
All the STL is not a small thing to dismiss... Not to mention all the
template libs there are in the wild.


Yes, you could implement resumable functions on top of resumable
Post by Nicol Bolas
expressions. But that doesn't prove resumable expressions are better. Nor
does it prove that anything you could build atop resumable expressions
cannot also be built atop resumable functions.
Maybe it is me only, but I find resumable expressions so easy to translate,
in my head, to what it really is.
I cannot say the same about await. It has its merits also, sure, but
resumable expressions seem simpler to me,
and everything else seems to be composable on top of it.


Can you tell me that resumable expressions have anywhere *near* the field
Post by Nicol Bolas
experience?
Well, the closest thing we can have is creating function objects with
stackless
coroutines: http://www.boost.org/doc/libs/1_59_0/doc/html/boost_asio/overview/core/coroutine.html

That is a function object with resumable feature. I think what a resumable
expression does can fit in my head
the same way a lambda fits.
I cannot say the same when I reason about await, at least not so easily.
And I am still not convinced by the fact that you
cannot hold it as an object (am I right?) and can only get its result, or by
the fact that you must type-erase it
(sure, there are optimizations for that, but for now they are in the MS
compiler only?).

Thank you very much for your feedback, it made me understand a few points I
was wrong about.
Nicol Bolas
2015-10-07 02:10:13 UTC
Permalink
Post by Germán Diago
2. it can emulate async from a library.
Just FYI: `async` doesn't exist, and hasn't existed in the resumable
functions proposals for quite some time.
Germán Diago
2015-10-07 03:44:49 UTC
Permalink
Post by Nicol Bolas
Post by Germán Diago
2. it can emulate async from a library.
Just FYI: `async` doesn't exist, and hasn't existed in the resumable
functions proposals for quite some time.
Sorry, I meant await. But it is good that it has disappeared.
Gor Nishanov
2015-10-06 14:04:24 UTC
Permalink
Post by Nicol Bolas
How much more performant? Is it enough to be worth arguing about? After
all, most things you'll be using await for won't be cheap operations. Will
you actually *notice* any such performance loss?
You should not expect any performance loss. When applied to concrete
problems, you should expect the Coroutines proposal to be as fast as or faster
than an equivalent solution using resumable expressions.
Post by Nicol Bolas
If it's a choice between forbidding inlining and *forcing* inlining, I'll
accept the overhead of forbidding inlining.
If coroutine lifetime is fully enclosed in the lifetime of the calling
function, then we can
1) elide allocation and use a temporary on the stack of the caller
2) replace indirect calls with direct calls and inline as appropriate:

For example:

auto hello(char const* p) {
    while (*p) yield *p++;
}

int main() {
    for (auto c : hello("Hello, world"))
        putchar(c);
}


Should produce the same code as if you had written:

int main() {
    auto p = "Hello, world";
    while (*p) putchar(*p++);
}
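
To make the claimed equivalence concrete without compiler support, here is a hand-written stand-in for the `hello` generator. It is a sketch under stated assumptions: `HelloGen`, `via_generator` and `via_loop` are hypothetical names, and nothing here is the P0057 `generator` type. The generator's whole state is one pointer held in the object, which is why an optimizer that can see through it has a chance of reducing the range-for to the plain loop.

```cpp
#include <cassert>
#include <string>

// Hypothetical hand-written equivalent of the `hello` generator: the whole
// coroutine state is one pointer, stored directly in the object.
class HelloGen {
public:
    explicit HelloGen(char const* p) : p_(p) { assert(p != nullptr); }

    struct iterator {
        char const* p;
        char operator*() const { return *p; }
        iterator& operator++() { ++p; return *this; }
        // The sequence ends when the current character is the terminator.
        bool operator!=(iterator const&) const { return *p != '\0'; }
    };

    iterator begin() const { return iterator{p_}; }
    iterator end() const { return iterator{nullptr}; }

private:
    char const* p_;
};

// Consumer written with the generator, mirroring the range-for in main().
inline std::string via_generator(char const* s) {
    std::string out;
    for (char c : HelloGen(s)) out.push_back(c);
    return out;
}

// Consumer written as the plain loop the optimizer is expected to reach.
inline std::string via_loop(char const* s) {
    std::string out;
    while (*s) out.push_back(*s++);
    return out;
}
```

Both consumers produce the same sequence; whether they compile to the same machine code depends on the optimizer and is not shown here.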
Nicol Bolas
2015-10-06 14:57:15 UTC
Permalink
Post by Nicol Bolas
If it's a choice between forbidding inlining and *forcing* inlining, I'll
Post by Nicol Bolas
accept the overhead of forbidding inlining.
If coroutine lifetime is fully enclosed in the lifetime of the calling
function, then we can
1) elide allocation and use a temporary on the stack of the caller
How exactly does std::experimental::generator<T> accomplish that? How can
the object know that it is contained entirely in this way? After all, the
promise type is what holds the state, and therefore the promise has to
decide whether to statically or dynamically allocate memory, as well as how
to handle the forwarding to the function to be resumed.

Is `generator` a magical compiler-only type, or could a user somehow
implement these optimizations themselves? Or am I misunderstanding
something about how all this works?
Gor Nishanov
2015-10-06 15:57:10 UTC
Permalink
Post by Nicol Bolas
How exactly does std::experimental::generator<T> accomplish that? How can
the object know that it is contained entirely in this way? After all, the
promise type is what holds the state, and therefore the promise has to
decide whether to statically or dynamically allocate memory, as well as how
to handle the forwarding to the function to be resumed.
There are only two magic types in this proposal. coroutine_traits, which
let the compiler figure out which promise_type describes the coroutine
semantics and coroutine_handle<P> which is synthesized by the compiler to
allow resumption and destruction of the coroutine.

If you look at the implementation of coroutine_handle in
<experimental/resumable>, you will notice the following two members:

void coroutine_handle::resume() { _coro_resume(_Ptr); }

void coroutine_handle::destroy() { _coro_destroy(_Ptr); }

_coro_resume and _coro_destroy are intrinsics that are implemented in our
optimizer. After inlining into main, the optimizer will observe the
following sequence:

$fp = _coro_alloc_elision() ? alloca(_coro_frame_size()) : operator new(_coro_frame_size()); // frame size of hello$ coroutine
bla
_coro_resume($fp)
bla
_coro_destroy($fp); <-- here

Now the optimizer can reason about the lifetime and also about which function
_coro_resume($fp) and _coro_destroy($fp) go to.
Since in this example $fp does not escape, the optimizer replaces
_coro_alloc_elision with 1; thus, allocation is done via alloca(constant),
which the optimizer makes into a normal automatic variable. _coro_resume and
_coro_destroy are replaced with direct calls to hello$resume_coro, which
after inlining will lead to what I showed in my previous post.

I talked to Clang implementers and they are planning to add this
optimization too. Note that it is explicitly allowed by

P0057/[dcl.fct.def.coroutine]/8 A coroutine *may* need to allocate memory
to store objects with automatic storage duration local
to the coroutine. *If so*, it shall obtain the storage by calling an
allocation function (3.7.4.1).
The allocation function’s name is looked up in the scope of the promise
type of the coroutine.
If this lookup fails to find the name, the allocation function’s name is
looked up in the global
scope...


In other words, when allocation is needed, here is how the compiler figures
out what allocation functions to use. But, if the compiler does not need to
allocate, it does not have to.
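
The choice between the two allocation paths can be mimicked by hand. This is a loose sketch, not the compiler's actual transformation: `Frame`, `allocate_frame`, `destroy_frame`, `run` and the `elide` flag are hypothetical stand-ins for the frame, the `_coro_alloc_elision` decision and the resume/destroy intrinsics. When the frame's lifetime is enclosed by the caller, it is constructed in the caller's own storage; otherwise it falls back to `operator new`.

```cpp
#include <cassert>
#include <new>

// Hypothetical stand-in for a coroutine frame; not the real layout.
struct Frame {
    int resume_point = 0;
    int local = 0;
};

// Counts heap allocations so the two paths can be told apart.
inline int& heap_allocations() { static int n = 0; return n; }

inline Frame* allocate_frame(bool elide, void* caller_storage) {
    if (elide) {
        // Elided path: construct the frame in storage owned by the caller.
        assert(caller_storage != nullptr);
        return ::new (caller_storage) Frame{};
    }
    ++heap_allocations();
    return new Frame{};  // Fallback path: ordinary dynamic allocation.
}

inline void destroy_frame(bool elide, Frame* f) {
    if (elide) f->~Frame(); else delete f;
}

// Runs a "coroutine" whose lifetime is fully enclosed by this call.
inline int run(bool elide) {
    alignas(Frame) unsigned char storage[sizeof(Frame)];
    Frame* fp = allocate_frame(elide, storage);
    fp->local = 41;
    fp->resume_point = 1;      // stands in for _coro_resume
    int result = fp->local + fp->resume_point;
    destroy_frame(elide, fp);  // stands in for _coro_destroy
    return result;
}
```

In the real optimization the decision is made by the compiler from escape analysis, not by a runtime flag; the flag only makes both paths visible in one program.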
Vicente J. Botet Escriba
2015-10-04 15:44:19 UTC
Permalink
Post by Gor Nishanov
http://open-std.org/JTC1/SC22/WG21/docs/papers/2014/n4287.pdf
which walks through some of the aspects of P0057 proposal. Note, that the
await syntax is actually quite old. It first appeared as do-notation in
Haskell in 1998 and you may notice that P0057 can be used to perform more
general "monadic" transformations and not only limited to coroutines.
Hmm, await cannot work with list as a monad, can it?
But no proposal is attempting to take care of this case.

Vicente
Evgeny Panasyuk
2015-10-12 05:08:13 UTC
Permalink
Post by Vicente J. Botet Escriba
Post by Gor Nishanov
http://open-std.org/JTC1/SC22/WG21/docs/papers/2014/n4287.pdf
which walks through some of the aspects of P0057 proposal. Note, that the
await syntax is actually quite old. It first appeared as do-notation in
Haskell in 1998 and you may notice that P0057 can be used to perform more
general "monadic" transformations and not only limited to coroutines.
Hmm, await cannot work with list as a monad, can it?
But no proposal is attempting to take care of this case.
If you use the same underlying technique as the macro-based stackless
coroutines of Boost.Asio, then it can work with the list monad, because such
a coroutine is just a value type which can be copied/moved.

Here is a small live demo of list-monad-like behavior based on stackless coroutines
from Boost.Asio: http://coliru.stacked-crooked.com/a/465f5bcb59c8b0b3
--
Evgeny Panasyuk
Richard Smith
2015-10-12 18:46:51 UTC
Permalink
Post by Evgeny Panasyuk
Post by Vicente J. Botet Escriba
Post by Vicente J. Botet Escriba
Post by Gor Nishanov
http://open-std.org/JTC1/SC22/WG21/docs/papers/2014/n4287.pdf
which walks through some of the aspects of P0057 proposal. Note, that the
await syntax is actually quite old. It first appeared as do-notation in
Haskell in 1998 and you may notice that P0057 can be used to perform more
general "monadic" transformations and not only limited to coroutines.
Hmm, await cannot work with list as a monad, can it?
But no proposal is attempting to take care of this case.
If you use the same underlying technique as the macro-based stackless
coroutines of Boost.Asio, then it can work with the list monad, because such
a coroutine is just a value type which can be copied/moved.
It doesn't really work; you can't support local variables with such a
model, because their lifetimes could be reentered after they end. P0057 is
fundamentally a coroutines proposal, not a monads proposal, because it does
not support repeated resumption from the same suspension state; I think
this is the right semantic match for an impure language such as C++.
Evgeny Panasyuk
2015-10-12 19:19:45 UTC
Permalink
Post by Evgeny Panasyuk
If you use the same underlying technique as the macro-based
stackless coroutines of Boost.Asio, then it can work with the list monad,
because such a coroutine is just a value type which can be copied/moved.
It doesn't really work; you can't support local variables with such a
model, because their lifetimes could be reentered after they end.
Local variables do work with technique used by stackless coroutines of
Boost.Asio (and proposals like N4244).

With such an approach the coroutine is transformed into a class. Local variables
are transformed into fields of the class (more precisely into nested unions
corresponding to scopes, as described in N4244), and the coroutine body is
transformed into a method-state-machine, where its states correspond to
yield points.
This already can be implemented via macros to some extent.
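
A hand-written instance of that transformation might look as follows. It is a minimal sketch, not the output of N4244 or of the Boost.Asio macros: `Squares` is a hypothetical class standing in for a coroutine body of the form `for (int i = 0; i < limit; ++i) yield i * i;`. The local `i` becomes a field, and the single yield point becomes a state of the `resume` method.

```cpp
#include <cassert>

// Hypothetical result of transforming a coroutine like
//     for (int i = 0; i < limit; ++i) yield i * i;
// into a class: the local `i` becomes a field and the yield point becomes
// a state of the resume method.
class Squares {
public:
    explicit Squares(int limit) : limit_(limit) { assert(limit >= 0); }

    // Resumes the body; returns false once the body has run to completion.
    bool resume(int& out) {
        switch (state_) {
        case Start:
            i_ = 0;   // loop initialization
            break;
        case AfterYield:
            ++i_;     // continue right after the yield point
            break;
        case Done:
            return false;
        }
        if (i_ < limit_) {
            out = i_ * i_;
            state_ = AfterYield;
            return true;
        }
        state_ = Done;
        return false;
    }

private:
    enum State { Start, AfterYield, Done };
    State state_ = Start;
    int limit_;
    int i_ = 0;
};
```

Because the object is an ordinary value type, `Squares b = a;` forks the suspended computation: both copies continue from the same yield point independently.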

Moreover, C#'s await is implemented based on similar approach:
http://www.codeproject.com/Articles/535635/Async-Await-and-the-Generated-StateMachine
Post by Evgeny Panasyuk
P0057
is fundamentally a coroutines proposal, not a monads proposal, because
it does not support repeated resumption from the same suspension state;
I think this is the right semantic match for an impure language such as C++.
It is intrinsically non-zero overhead, due to type-erasure/allocations.
This fact alone is strong argument against it.

While with approach based on method-state-machine - we can get both:
generality and performance.
Gor Nishanov
2015-10-12 19:30:44 UTC
Permalink
Evgeny:

Note the "purity" word in Richard's answer.
If you write a body of the coroutine in a pure manner, you can hack P0057
and in your await_suspend for the list monad resume the coroutine multiple
times.
You need to provide proper final_suspend and return_value to make it work.
But it will work ONLY if your body is pure :-). That is the body of your
coroutine. And you cannot save any state in the awaiter, since it is torn
down at the end of the full expressions, hence, I am using thread_local to
ferry a value from await_suspend to await_resume.

Here is a "do not try this at home" awaiter for list<T>. Untested. Just
an idea of what it could look like.

auto operator await(list<T> const& l) {
    struct awaiter {
        list<T> const* list_;

        static thread_local T const* result_;

        bool await_ready() { return false; }

        void await_suspend(coroutine_handle<> h) {
            auto l = list_;
            for (auto&& item : *l) { result_ = &item; h.resume(); }
            // add code to extract the result from the promise and do something with it.
        }

        // for every element of the list return the value that we stashed in thread_local
        T const& await_resume() { return *result_; }
    };
    return awaiter{&l};
}
Richard Smith
2015-10-12 19:59:26 UTC
Permalink
Post by Gor Nishanov
Note the "purity" word in Richard's answer.
If you write a body of the coroutine in a pure manner, you can hack P0057
and in your await_suspend for the list monad resume the coroutine multiple
times.
You need to provide proper final_suspend and return_value to make it work.
But it will work ONLY if your body is pure :-). That is the body of your
coroutine. And you cannot save any state in the awaiter, since it is torn
down at the end of the full expressions, hence, I am using thread_local to
ferry a value from await_suspend to await_resume.
Here is "do not try this at home" awaiter for the list<T>. Untested. Just
an idea of how it can look like.
auto operator await(list<T> const& l) {
struct awaiter {
list<T> const * list_;
static thread_local T* result_;
bool await_ready() { return false; }
void await_suspend(coroutine_handle<> h) {
auto l = list_;
for (auto && item : *l) { result_ = &item; h.resume(); }
For this to work, I think you'd need your coroutine to (somehow) repeatedly
await the list item. That is, instead of:

list_monad<int> foo(list_monad<int> v) {
    int x = await v;
    return x * x;
}

... you'd need to write:

list_monad<int> foo(list_monad<int> v) {
loop:
    int x = await v;
    // somehow return x * x then conditionally goto loop.
}

... because you don't have any kind of call/cc primitive.
Gor Nishanov
2015-10-12 20:24:56 UTC
Permalink
Post by Richard Smith
list_monad<int> foo(list_monad<int> v) {
int x = await v;
// somehow return x * x then conditionally goto loop.
}
... because you don't have any kind of call/cc primitive.
Yep. You are right. Without "checkpointing" of the coroutine state you
would need a loop. But if you had :-) checkpointing, then
maybe this:

void await_suspend(coroutine_handle<> h) {
    auto l = list_;
    auto checkpoint = h.checkpoint();
    for (auto&& item : *l) { result_ = &item; h.resume(); h.load(checkpoint); }
    // add code to extract the result from the promise and do something with it.
}

For "pure" functions checkpointing is cheap. The only state they have is
which suspend point they are at.
In no way am I suggesting that we are going to do checkpointing. Just
geeking out.
Evgeny Panasyuk
2015-10-13 02:28:35 UTC
Permalink
Post by Gor Nishanov
Note the "purity" word in Richard's answer.
If you write a body of the coroutine in a pure manner, you can hack P0057
and in your await_suspend for the list monad resume the coroutine multiple
times.
Here is an example of how it can be implemented using method-state-machine
macros :
http://coliru.stacked-crooked.com/a/a463b0a5504c7401
COROUTINE(vector<int>, list_demo, (int, param),
          (int, local_x)
          (int, local_y))
{
    AWAIT(local_x =) vector<int>{1,2,3};
    AWAIT(local_y =) vector<int>{10, 20, 30};

    RETURN(local_x + local_y + param);
}
COROUTINE_END;

int main()
{
    auto xs = list_demo{1000}();
    for(auto x : xs)
        cout << x << " ";
}
// Prints: 1011 1021 1031 1012 1022 1032 1013 1023 1033

There is no purity requirement, and actually coroutine body may contain
imperative loops.

Possible syntax with language support:
vector<int> list_demo(int param)
{
    int local_x = await vector<int>{1,2,3};
    int local_y = await vector<int>{10, 20, 30};

    return local_x + local_y + param;
}
As you can see - it is straightforward syntax transformation from
macro-based version.
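
For comparison, the same list-monad flow can be spelled out in standard C++ with an explicit bind, which is roughly what each await on a vector has to mean here. This is a sketch with a hypothetical helper `bind_list`; it is not the macro implementation from the link above.

```cpp
#include <cassert>
#include <vector>

// Hypothetical helper: monadic bind for std::vector (the "list monad").
// Applies f to every element and concatenates the resulting vectors.
template <class T, class F>
auto bind_list(std::vector<T> const& xs, F f) -> decltype(f(xs.front())) {
    decltype(f(xs.front())) out;
    for (T const& x : xs) {
        auto part = f(x);
        out.insert(out.end(), part.begin(), part.end());
    }
    return out;
}

// Each nested lambda plays the role of "the rest of the coroutine body"
// after the corresponding await.
inline std::vector<int> list_demo(int param) {
    std::vector<int> const xs{1, 2, 3};
    std::vector<int> const ys{10, 20, 30};
    std::vector<int> result = bind_list(xs, [&](int x) {
        return bind_list(ys, [&](int y) {
            return std::vector<int>{x + y + param};
        });
    });
    assert(result.size() == xs.size() * ys.size());
    return result;
}
```

Calling `list_demo(1000)` yields 1011 1021 1031 1012 1022 1032 1013 1023 1033, the same sequence as the macro-based demo. The nesting of lambdas is the boilerplate an await-style syntax would hide.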
Richard Smith
2015-10-12 19:53:38 UTC
Permalink
Post by Evgeny Panasyuk
Post by Evgeny Panasyuk
If you use the same underlying technique as the macro-based
stackless coroutines of Boost.Asio, then it can work with the list monad,
because such a coroutine is just a value type which can be copied/moved.
It doesn't really work; you can't support local variables with such a
model, because their lifetimes could be reentered after they end.
Local variables do work with technique used by stackless coroutines of
Boost.Asio (and proposals like N4244).
With such an approach the coroutine is transformed into a class. Local variables
are transformed into fields of the class (more precisely into nested unions
corresponding to scopes, as described in N4244), and the coroutine body is
transformed into a method-state-machine, where its states correspond to
yield points.
This already can be implemented via macros to some extent.
I think you've missed my point about object lifetime. Consider:

list_monad<int> f(list_monad<int> ints) {
    {
        auto x = make_shared<int>(42);
        auto &r = x;
        int y = await ints; // #1, suppose this behaves like a list monad
        cout << *r + y;
    } // #2
    return 0;
}

No matter how you transform this into a class, it won't actually work (and
rightly so): the lifetime of the x object ends the first time line #2 is
reached. When you try to resume at line #1, there's no way to bring x back
to life again. Now, you might suggest that the way to solve this is to make
a copy of the monad state at the point where we hit the 'await', so you can
"safely" resume it multiple times. But that doesn't work either: your
copy's 'r' would refer to the original's 'x' (whose lifetime has ended),
not to the copy's 'x'.
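
The aliasing problem in the last sentence can be reproduced with an ordinary struct. This is only a sketch: `FrameOfF` is a hypothetical stand-in for the class a compiler or macro would generate for `f`, with the reference `r` modelled as a pointer field.

```cpp
#include <cassert>
#include <memory>

// Hypothetical coroutine state for f: the locals `x` and `r` become fields.
struct FrameOfF {
    std::shared_ptr<int> x = std::make_shared<int>(42);
    std::shared_ptr<int>* r = &x;  // stands in for `auto &r = x;`
};

// Returns true when a memberwise copy still points into the original frame.
inline bool copy_aliases_original() {
    FrameOfF original;
    assert(original.r == &original.x);  // the original is self-consistent
    FrameOfF copy = original;           // compiler-generated memberwise copy
    return copy.r == &original.x && copy.r != &copy.x;
}
```

A correct copy would need to know that `r` refers to a sibling field and rebind it, which a memberwise copy of the state does not do.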
Post by Evgeny Panasyuk
http://www.codeproject.com/Articles/535635/Async-Await-and-the-Generated-StateMachine
The implementation approach is fine for coroutines (C#'s await doesn't
support the list monad / continuations), but doesn't work for the full
generality of monads in a system with mutable state.
Evgeny Panasyuk
2015-10-12 21:08:15 UTC
Permalink
Post by Richard Smith
list_monad<int> f(list_monad<int> ints) {
No matter how you transform this into a class, it won't actually work
(and rightly so): the lifetime of the x object ends the first time line
#2 is reached. When you try to resume at line #1, there's no way to
bring x back to life again. Now, you might suggest that the way to solve
this is to make a copy of the monad state at the point where we hit the
'await', so you can "safely" resume it multiple times. But that doesn't
work either: your copy's 'r' would refer to the original's 'x' (whose
lifetime has ended), not to the copy's 'x'.
Thank you for detailed description, I get your point.
I am aware of this issue, and I agree that it can lead to subtle bugs -
because such code works in unintuitive/unaccustomed manner.

Nevertheless, I don't think that possibility of such bugs makes whole
approach non-usable. I think it is acceptable price for performance and
generality/features/power it provides.
C++ was never a defensive language.

For instance I want to use non-owning raw pointers, and I accept the
price of increased possibility of memory corruption. And if one needs
higher defensiveness - it is possible to use shared/weak_ptr in
casual/wasteful manner.

Same applies here - I want to copy/move/fork/serialize/etc coroutines,
and I agree to pay for possibility of problems with locals lifetime
issues. But if someone would like to avoid such issues, and does not need
fork/etc - then they could use non-copyable, non-movable coroutines
allocated on the heap.
Post by Richard Smith
http://www.codeproject.com/Articles/535635/Async-Await-and-the-Generated-StateMachine
The implementation approach is fine for coroutines (C#'s await doesn't
support the list monad / continuations),
Yes, C# await doesn't support copy of state (at least in straightforward
way).
My point here is that approach based on such kind of transformation
(coroutine body into method-state-machine, locals to class fields) is
already implemented and used in one of mainstream languages.
Post by Richard Smith
but doesn't work for the full
generality of monads in a system with mutable state.
Why? As I can see copying of coroutine is similar to call/cc, which in
turn is somewhat dual to monads.
By the way, I did an example sometime ago how to use call/cc to get
"monadic flow", including List monad: http://ideone.com/7uOVe2
Gor Nishanov
2015-10-12 22:18:24 UTC
Permalink
Post by Evgeny Panasyuk
Same applies here - I want to copy/move/fork/serialize/etc coroutines,
and I agree to pay for possibility of problems with locals lifetime
issues.
Write a proposal. It is trivial to add clone(), checkpoint(), save(),
restore() members to coroutine_handle<>. A coroutine handle is just a
pointer to a blob of memory representing the current state of the coroutine.

In fact, you can probably hack it up today using VS 2015 RTM, by providing
your own allocator and learning the size and location of the memory block
representing the coroutine state. Once you know it, you can pretty much do
whatever you want with it.
Evgeny Panasyuk
2015-10-12 22:47:17 UTC
Permalink
Post by Evgeny Panasyuk
Same applies here - I want to copy/move/fork/serialize/etc coroutines,
and I agree to pay for possibility of problems with locals lifetime
issues.
Write a proposal. It is trivial to add clone(), checkpoint(), save(),
restore() members to coroutine_handle<>. A coroutine handle is just a
pointer to a blob of memory representing the current state of the coroutine.
My main concern about P0057R0 is type-erasure - it is far from being
zero-overhead.
And as far as I can see, a stackless coroutine can be implemented without such
type-erasure. Its size is known at compile-time - and it can be just
a normal type with all data contained within its sizeof - there is no
need for any special allocation/deallocation of "remote" parts.
Post by Evgeny Panasyuk
In fact, you can probably hack it up today using VS 2015 RTM, by
providing your own allocator and learning the size and location of the
memory block representing the coroutine state. Once you know it, you can
pretty much do whatever you want with it.
Well, same is possible with stackful coroutines and even with threads.
But it is just chunk of raw bits. And all type info required to do
proper copy/move (not just memcpy) is removed during compilation.
Gor Nishanov
2015-10-12 23:03:34 UTC
Permalink
Post by Evgeny Panasyuk
My main concern about P0057R0 is type-erasure - it is far from being
zero-overhead.
I have had an outstanding challenge for a year already to anyone who thinks
that way: come up with a real-world problem, reduce it to a manageable size
(say async_tcp_reader), write it up both ways, using P0057 and whatever
you consider zero-overhead, and evaluate on three criteria:

1) How much code the end user has to write
2) How much library support is required
3) What the abstraction penalty is: how many instructions need to get
executed to get from, say, await Read(buf, len) to a low-level
API/hardware, say WSARecv

My statement is that P0057 is as good or better on all 3 criteria than any
other proposal I've seen. If you want to accept the challenge, write up an
equivalent to TcpReader described in one of these two presentations:

Compared to hand-crafted state machines using callbacks, P0057 has negative
overhead.
See: https://github.com/CppCon/CppCon2015/blob/master/Presentations/C%2B%2B%20Coroutines/C%2B%2B%20Coroutines%20-%20Gor%20Nishanov%20-%20CppCon%202015.pdf

or

http://open-std.org/JTC1/SC22/WG21/docs/papers/2014/n4287.pdf
--
Evgeny Panasyuk
2015-10-13 21:37:43 UTC
Permalink
Post by Evgeny Panasyuk
My main concern about P0057R0 is type-erasure - it is far from being
zero-overhead.
I have had an outstanding challenge for a year already to anyone who
thinks that way to come up with a real world problem, reduce it to
managable size (say async_tcp_reader) write it up it both ways using
P0057 and whatever you consider zero overhead and evaluate on three
An example of a real-world problem is generator/yield.
An extra allocation here results in significant overhead. Even if some
kind of "small object optimization" scheme is used, it is still not
zero overhead.
Post by Evgeny Panasyuk
1) How much code end-user have to write
Code is very similar in both cases.
Post by Evgeny Panasyuk
2) How much library support required
Nearly the same. Maybe some additional customization points.
Post by Evgeny Panasyuk
3) What is an abstraction penalty, how many instructions need to get
executed to get from, say, await Read(buf, len) to an low-level
API/hardware, say WSARecv
Because of concrete types instead of type-erasure (and allocations), the
abstraction penalty is much lower in a subset of cases, like generators.
The execution path is very similar, perhaps even shorter due to fewer
indirections.
Post by Evgeny Panasyuk
My statement is that P0057 is as good or better on all 3 criteria than
any other proposal I've seen. If you want to accept the challenge, write
There are use cases where allocation is OK, for instance when the
structure is large and/or is moved around many times. Your example shows
exactly this. But it is not the only use case for stackless coroutines.
--
Nicol Bolas
2015-10-13 22:22:39 UTC
Permalink
Post by Evgeny Panasyuk
Post by Evgeny Panasyuk
My main concern about P0057R0 is type-erasure - it is far from being
zero-overhead.
I have had an outstanding challenge for a year already to anyone who
thinks that way to come up with a real world problem, reduce it to
managable size (say async_tcp_reader) write it up it both ways using
P0057 and whatever you consider zero overhead and evaluate on three
Example of real world problem is generator/yield.
An extra allocation here results in significant overhead. Even if some
kind of "small object optimization" scheme is used - it is still not
zero overhead.
Except that he's already proven (in this thread no less) that a good
optimizer can elide the allocation. If the compiler can reasonably *make*
it zero overhead, then it *is* zero overhead.
--
Evgeny Panasyuk
2015-10-13 23:45:41 UTC
Permalink
Post by Nicol Bolas
Post by Evgeny Panasyuk
Post by Evgeny Panasyuk
My main concern about P0057R0 is type-erasure - it is far from being
zero-overhead.
I have had an outstanding challenge for a year already to anyone who
thinks that way to come up with a real world problem, reduce it to
managable size (say async_tcp_reader) write it up it both ways using
P0057 and whatever you consider zero overhead and evaluate on three
Example of real world problem is generator/yield.
An extra allocation here results in significant overhead. Even if some
kind of "small object optimization" scheme is used - it is still not
zero overhead.
Except that he's already proven (in this thread no less) that a good
optimizer can elide the allocation. If the compiler can reasonably *make*
it zero overhead, then it *is* zero overhead.
1. It is (practically) impossible in the general case.
For instance, when we put coroutines in a container, like:
vector<coroutine> x(N);
With coroutines of concrete type and sizeof known at compile time, this
can be done within a single allocation.
But if the coroutine type is erased, then we will have N+1 allocations in
the general case; this can't practically be elided.

2. Even if we consider only function scopes, escape analysis would not give
a 100% guarantee of elision in every case. First, I think it would hit the
halting problem; second, some functions in the call tree may not be inlined
for legitimate reasons, and this would blind the analysis.

3. This would put an additional burden on implementers, and I don't see
reasonable benefits that we would get for such a burden.

4. C++11 has lambdas with concrete types; this ensures zero overhead and
fits naturally into the language. We don't have type-erased closures. We can
use external type erasure like std::function when needed.
Why should we have type-erasure for stackless coroutines?
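
The N+1 versus 1 allocation count from point 1 can be sketched as follows. This is only an illustration under stated assumptions: ConcreteCoroutine, ErasedCoroutine, CountingAllocator and allocations_for are hypothetical stand-ins invented here, not types from P0057 or P0114, and the "erased" variant simply models a handle that owns a separately allocated frame.

```cpp
#include <cassert>
#include <cstddef>
#include <memory>
#include <vector>

// Running tally of heap allocations made by the two stand-in types below.
inline std::size_t& allocation_count() {
    static std::size_t count = 0;
    return count;
}

// Minimal allocator that records every allocate() call in the tally.
template <class T>
struct CountingAllocator {
    using value_type = T;
    CountingAllocator() = default;
    template <class U>
    CountingAllocator(const CountingAllocator<U>&) noexcept {}
    T* allocate(std::size_t n) {
        ++allocation_count();
        return std::allocator<T>{}.allocate(n);
    }
    void deallocate(T* p, std::size_t n) noexcept {
        std::allocator<T>{}.deallocate(p, n);
    }
};
template <class T, class U>
bool operator==(const CountingAllocator<T>&, const CountingAllocator<U>&) noexcept { return true; }
template <class T, class U>
bool operator!=(const CountingAllocator<T>&, const CountingAllocator<U>&) noexcept { return false; }

// Hypothetical coroutine with a concrete type: its state is stored inline.
struct ConcreteCoroutine {
    int resume_point = 0;
    int local = 0;
};

// Hypothetical type-erased coroutine: the handle owns a separately
// allocated frame, so constructing one costs one heap allocation.
struct ErasedCoroutine {
    struct Frame { int resume_point = 0; int local = 0; };
    std::unique_ptr<Frame> frame;
    ErasedCoroutine() : frame(new Frame) { ++allocation_count(); }
};

// Number of allocations needed to build a vector of n elements of type T.
template <class T>
std::size_t allocations_for(std::size_t n) {
    assert(n > 0);
    const std::size_t before = allocation_count();
    std::vector<T, CountingAllocator<T>> coroutines(n);
    return allocation_count() - before;
}
```

With the usual standard library implementations, allocations_for<ConcreteCoroutine>(8) should report a single allocation (the vector's buffer), while allocations_for<ErasedCoroutine>(8) should report that buffer plus one frame per element. The sketch says nothing about what an optimizer might elide in simpler, non-container cases.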
--
Gor Nishanov
2015-10-14 00:03:55 UTC
Permalink
Please see this thread from last year.

https://groups.google.com/a/isocpp.org/forum/?fromgroups#!searchin/std-proposals/resumable$20lambdas/std-proposals/_ssbHm4C2t8/VnQF63au8U0J

Make sure to read to the point where Ville said: "Ouch" and what followed
afterwards.
--
Gor Nishanov
2015-10-14 00:11:57 UTC
Permalink
Okay, a slightly less cryptic reply. The coroutine frame must be stationary
once the coroutine starts running.
Resumable Expressions abandoned the movability/copyability of the lambda*
that was present in the earlier resumable lambda proposal, for the reasons
highlighted in the thread I linked earlier. Thus, Resumable Expressions is
in exactly the same boat as P0057.

The difference is that in Resumable Expressions you must do type erasure by
hand, which is difficult to eliminate, whereas in P0057 the compiler decides
whether it needs to do type erasure or not, allowing it to be optimized out
when unnecessary.
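
One way to see why a running frame has to stay put is the sketch below. It assumes a hypothetical Frame type in which one "local" (cursor) points at another local (buffer) stored in the same frame; none of these names come from either proposal. A plain memberwise copy leaves the copy's pointer aimed at the original frame.

```cpp
#include <cassert>

// Hypothetical coroutine frame: `cursor` plays the role of a local pointer
// that refers to another local (`buffer`) stored in the same frame.
struct Frame {
    int buffer[4] = {10, 20, 30, 40};
    int* cursor = buffer;              // points into this very object

    int resume() {                     // yields the next buffered value
        assert(cursor != buffer + 4);
        return *cursor++;
    }
};

// True if f.cursor refers to one of f's own buffer slots.
bool cursor_is_internal(const Frame& f) {
    for (const int& slot : f.buffer)
        if (f.cursor == &slot) return true;
    return false;
}

// Copies a running frame memberwise and reports whether the copy's cursor
// still points into the original frame instead of into the copy.
bool copy_aliases_original() {
    Frame original;
    original.resume();                 // the coroutine has started running
    Frame copy = original;             // memberwise copy of the frame
    return copy.cursor == original.cursor && !cursor_is_internal(copy);
}
```

Here copy_aliases_original() returns true: the copied frame silently aliases the original's storage. That is the kind of subtle bug a stationary-frame rule, or an explicit opt-in to copying, is meant to guard against.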
--
Evgeny Panasyuk
2015-10-14 00:25:39 UTC
Permalink
Post by Gor Nishanov
Whereas in P0057 compiler decides whether it needs to do type erasure or
not, thus, allowing to optimize it out when unnecessary.
What about the "vector<coroutine>(N)" use case? As far as I can see, the
overhead of N allocations can't be elided automatically.
--
Gor Nishanov
2015-10-14 00:34:34 UTC
Permalink
Post by Evgeny Panasyuk
What about "vector<coroutine>(N)" use case? As I can see - overhead of N
allocations can't be elided automatically.
How can P0114 help you with that? Coroutine state is uncopyable and
unmovable there as well.
--
Evgeny Panasyuk
2015-10-14 02:18:36 UTC
Permalink
Post by Evgeny Panasyuk
What about "vector<coroutine>(N)" use case? As I can see - overhead of N
allocations can't be elided automatically.
How can P0114 help you with that? Coroutine state is uncopyable and
unmovable there as well.
It does not erase the type:
"
No hidden memory allocations. The memory representation of a resumable
expression can be wherever you need it: on the stack, a global, or a
member of a possibly heap-allocated object.
"

And there is a possibility of copyability:
"
10.1 Allowing copyability
By allowing copyability of resumable objects, we enable interesting use
cases such as undo stacks. Although this behaviour comes with risk
associated with aliasing of local variables, an explicit opt in may be
feasible.
"

But even if there were no copyability/movability, and as a result we
could not use std::vector, we could still place N coroutines into an
array with a single allocation: make_unique<coroutine[]>(N)
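
A minimal sketch of that last point, assuming a hypothetical PinnedCoroutine type that is neither copyable nor movable (the name and its resume() member are made up for illustration). Such objects cannot be relocated by a growing container, yet N of them can still be constructed in place by one array allocation.

```cpp
#include <cassert>
#include <cstddef>
#include <memory>
#include <type_traits>

// Hypothetical coroutine object that must stay where it was constructed.
struct PinnedCoroutine {
    PinnedCoroutine() = default;
    PinnedCoroutine(const PinnedCoroutine&) = delete;
    PinnedCoroutine& operator=(const PinnedCoroutine&) = delete;

    int steps = 0;
    int resume() { return ++steps; }   // stand-in for running to the next suspension
};

static_assert(!std::is_copy_constructible<PinnedCoroutine>::value,
              "frames are not copyable");
static_assert(!std::is_move_constructible<PinnedCoroutine>::value,
              "frames are not movable");

// Builds n coroutines in one array allocation and resumes each of them once.
int resume_each_once(std::size_t n) {
    auto pool = std::make_unique<PinnedCoroutine[]>(n);  // single new[] for all n
    int total = 0;
    for (std::size_t i = 0; i < n; ++i) {
        total += pool[i].resume();
        assert(pool[i].steps == 1);
    }
    return total;
}
```

Because the element type is concrete and its size is known, the array form of std::make_unique value-initializes all elements inside one block; no per-coroutine allocation is involved.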
--
Gor Nishanov
2015-10-14 02:31:33 UTC
Permalink
It forces you to do it by hand, as you can see in both the generator and
async examples. The key is that it requires a stationary frame, and for
that you either need to heap allocate or keep the lifetime of the coroutine
fully enclosed in the lifetime of its consumer. That is exactly the same
case where heap elision is done.
Post by Evgeny Panasyuk
But, even if there would be no copyability/moveability, and as the
result we cannot use std::vector - we still can place N coroutines into
array with single allocation: make_unique<coroutine[]>(N)
P0057 allows you to customize allocation, so you can achieve the same
goal by different means.
I do not claim that the feature lists are identical for P0057 and P0114. I
claim that for a complicated problem, like async programming, for example,
a P0057 solution will result in:

1) less user-written code
2) less library support code
3) less abstraction overhead (see TcpReader, for example)

than a solution to the same problem in P0114.
If you want to argue the superiority of P0114, pick a problem (hint, hint:
async programming), write a solution, and compare it with the P0057
equivalent.
--
Evgeny Panasyuk
2015-10-14 03:46:23 UTC
Permalink
Post by Gor Nishanov
It forces you to do it by hand, as you can see this in both generator
and async examples.
Not "by hand": this type-erasure can live within the standard library, just
like std::function. And it is not hard at all to use lambdas with
std::function when needed.
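
The lambda analogy can be sketched like this. The helper names (make_adder, call_concrete, call_erased, demo) are illustrative only; the point is that the closure keeps its concrete type by default, and std::function is applied on top only where erasure is wanted.

```cpp
#include <cassert>
#include <functional>
#include <type_traits>

// Returns a closure; its type is unique, concrete, and known to the compiler.
inline auto make_adder(int base) {
    return [base](int x) { return base + x; };
}
using Adder = decltype(make_adder(0));

static_assert(!std::is_same<Adder, std::function<int(int)>>::value,
              "a closure type is not a std::function");

// Generic consumer: instantiated per closure type, so the call can be inlined.
template <class F>
int call_concrete(F f, int x) { return f(x); }

// Type-erased consumer: one signature for every callable, at the cost of an
// indirect call and possibly a heap allocation inside std::function.
int call_erased(const std::function<int(int)>& f, int x) { return f(x); }

int demo() {
    Adder add5 = make_adder(5);
    std::function<int(int)> erased = add5;   // opt in to erasure only here
    assert(static_cast<bool>(erased));       // the wrapper holds a target
    return call_concrete(add5, 1) + call_erased(erased, 2);   // 6 + 7
}
```

A library-level coroutine wrapper could follow the same split: a concrete type for the common case, and an erased wrapper for interfaces that need a single type.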
Post by Gor Nishanov
The key is it requires stationary frame and for that
you either need to heap allocate or keep the lifetime of the coroutine
fully enclosed in the lifetime of its consumer. Exactly the same case
where heap elision is done.
1. Even if we consider only function scopes, escape analysis would not
give a 100% guarantee of elision in every case. First, I think it would
hit the halting problem; second, some functions in the call tree may not
be inlined for legitimate reasons, and this would blind the analysis.
With a concrete coroutine type, no analysis is required at all.

2. Coroutine locals can be copied/moved along with the coroutine itself;
this is already implementable via macros. P0114 does not exclude the
possibility of copying.
But even if we add .clone() to P0057, we would still have an erased type.

3. Again, consider the case when a coroutine is stored in a structure or
an array, like make_unique<coroutine[]>(N).
Yes, it will be on the heap, but the coroutines would not be allocated
separately; there will be just one allocation.
The same applies to structures. For instance, if we have:
struct Foo
{
    coroutine x;
    // ...
};
and then do make_unique<Foo>(), then with a type-erased coroutine there
will be two allocations, but with a coroutine of concrete type just one.
Post by Gor Nishanov
But, even if there would be no copyability/moveability, and as the
result we cannot use std::vector - we still can place N coroutines into
array with single allocation: make_unique<coroutine[]>(N)
P0057 allows you to customize allocation, thus you can achieve the same
goal, but with different means.
1. Even with custom allocation, in the case of an array of coroutines
there will be O(N) overhead versus a possible O(1).

2. You still need a place for the allocation buffer. And things are
complicated (resulting in overhead) by the fact that you do not know at
compile time how much space each coroutine will take.
Post by Gor Nishanov
I do not claim that feature lists are identical for P0057 and P0114. I
claim that for a complicated problem, like async programming, for example,
1) less user written code
2) less library support code
3) less abstraction overhead (see TcpReader, for example)
Than a solution to the same problem in P0114.
If you want to argue superiority of P0114, pick a problem (hint, hint
async programming) write a solution, compare with equivalent of P0057.
I am not talking specifically about P0114. Right now I am talking about
stackless coroutines with concrete non-type-erased types. And this is
possible even for P0057, without any major syntax change.
--
German Diago
2015-10-14 07:18:49 UTC
Permalink
Not "by hand" - this type-erasure can be within standard library. Just like
std::function. And it is not hard at all to use lambdas with std::function,
when needed.

Completely agree. This is the way it should be done, IMHO. Not embedding it
into the language for no gain.
Post by Gor Nishanov
I do not claim that feature lists are identical for P0057 and P0114. I
claim that for a complicated problem, like async programming, for example,
1) less user written code
This is due to embedding more into the language. Resumable expressions
can do all of it in libraries.
So as a solution, I find a library superior to embedding it into the
language.
Post by Gor Nishanov
2) less library support code
Again, because it is embedded.
Post by Gor Nishanov
3) less abstraction overhead (see TcpReader, for example)
This is a carefully chosen use-case. There are many others. Though, it is
a real one, I cannot say it is not.
Post by Gor Nishanov
Than a solution to the same problem in P0114.
If you want to argue superiority of P0114, pick a problem (hint, hint
async programming) write a solution, compare with equivalent of P0057.
That you need to write less code does not mean that it is zero-overhead,
which is the main point of the discussion.
--
German Diago
2015-10-14 07:12:19 UTC
Permalink
Post by Gor Nishanov
Okay. Slightly less cryptic reply. Coroutine frame must be stationary once
the coroutine starts running.
Resumable Expressions abandoned movability/copyability of the lambda* that
was present in earlier resumable lambda proposal.
The new paper mentions an opt-in for move/copy. It could be proposed. I
find it useful.
Post by Gor Nishanov
The difference is that in Resumable Expressions you must do type erasure
by hand which is difficult to eliminate.
Whereas in P0057 compiler decides whether it needs to do type erasure or
not, thus, allowing to optimize it out when unnecessary.
What I really like about resumable expressions is that how they work is
really, really obvious and lightweight.
Your implementation, Gor, looks good to me. But I think we can avoid some
of the trouble. I do not see a problem in having library-abstracted
generators on top of resumable expressions.
Why should we embed a full protocol in the language itself when we can get
it done with only "break resumable" and have the rest built on top of
library abstractions? I just do not get why, because additionally you can
have your generators, your await, everything, and remove type erasure and
have *real* zero overhead from the beginning, without any fancy
optimizations or escape analysis. I am against introducing into the
language something that is not inherently zero-overhead when we have
alternatives. Do not get me wrong: the proposal gives a lot of
inspiration, in my opinion, for how to do a few great things. But I
honestly think we can do better.

I know about your suggestion on how to compare, but I simply do not have
enough time. I wish I had, but I am short on time.
I see resumable expressions as more understandable with respect to the
traditional C++ model, and I think they guarantee zero overhead in more
cases than your proposal. I do recognize that the numbers you show for your
implementation look good, but, again:

1. They need compiler optimizations such as escape analysis.
2. No matter how you put it, they are not inherently zero-overhead given
the current state of the art. Even you mentioned that clang is planning to
introduce some optimization that is not available yet. I think we should
not get into that trouble when we have alternatives. There are other
compilers around as well: Intel, IBM...
--
Evgeny Panasyuk
2015-10-14 00:20:18 UTC
Permalink
Post by Gor Nishanov
Please see this thread from the last year.
https://groups.google.com/a/isocpp.org/forum/?fromgroups#!searchin/std-proposals/resumable$20lambdas/std-proposals/_ssbHm4C2t8/VnQF63au8U0J
Make sure to read to the point where Ville said: "Ouch" and what
followed afterwards.
Thanks for the link. I already commented on this issue in the current topic:
https://groups.google.com/a/isocpp.org/d/msg/std-proposals/L5ZsY1SYnrA/kGXSVV4RDgAJ

In short, yes, I agree: it could lead to subtle bugs.
But the same applies to raw non-owning pointers: they can lead to subtle
bugs too, yet we still use them.
And in general, we do not ban square root just because we have negative
numbers.
--
Gor Nishanov
2015-10-14 00:32:47 UTC
Permalink
You can add the ability to clone() and restore() the coroutine state on top
of either proposal.
If someone feels strongly about it, he/she can write and submit a proposal
to make it happen.
--
Evgeny Panasyuk
2015-10-14 00:44:44 UTC
Permalink
Post by Gor Nishanov
You can add an ability to clone() and restore() the coroutine state on
top of either proposals.
If someone feel strongly about it, he/she can write and submit a
proposal to make it happen.
clone/restore are just desirable features. Yes, maybe they can be added
separately.
But I am talking about the allocation overhead, which exists here due to
the erased type; it is a separate issue. If we get an erased type into the
ISO standard, it will be there for a long time, and it would be hard to fix.

For instance, I expect generators to be fast, small, zero (or at least
almost zero) overhead things, like a transform iterator. I do not want a
transform iterator that is allocated on the heap.
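
What "a generator like a transform iterator" might look like when written by hand is sketched below. Squares and sum_squares are hypothetical names, and this is an ordinary iterator-based range rather than the output of either proposal; it only illustrates a generator whose whole state is a concrete object on the stack.

```cpp
#include <cassert>

// Hypothetical stack-only generator yielding 0*0, 1*1, ..., (count-1)^2.
class Squares {
public:
    explicit Squares(int count) : count_(count) { assert(count >= 0); }

    class iterator {
    public:
        explicit iterator(int i) : i_(i) {}
        int operator*() const { return i_ * i_; }          // compute lazily
        iterator& operator++() { ++i_; return *this; }     // advance state
        bool operator!=(const iterator& other) const { return i_ != other.i_; }
    private:
        int i_;                                            // the entire "frame"
    };

    iterator begin() const { return iterator(0); }
    iterator end() const { return iterator(count_); }

private:
    int count_;
};

// Consumes the generator with a range-based for loop; no heap is involved.
int sum_squares(int count) {
    int sum = 0;
    for (int value : Squares(count)) sum += value;
    return sum;
}
```

The generator's state is a single int that the compiler can keep in a register, which is the cost model being asked for here.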
--
Nicol Bolas
2015-10-14 01:57:55 UTC
Permalink
Post by Evgeny Panasyuk
Post by Nicol Bolas
Post by Evgeny Panasyuk
Post by Evgeny Panasyuk
My main concern about P0057R0 is type-erasure - it is far from being
zero-overhead.
I have had an outstanding challenge for a year already to anyone who
thinks that way to come up with a real world problem, reduce it to
managable size (say async_tcp_reader) write it up it both ways using
P0057 and whatever you consider zero overhead and evaluate on three
Example of real world problem is generator/yield.
An extra allocation here results in significant overhead. Even if some
kind of "small object optimization" scheme is used - it is still not
zero overhead.
Except that he's already proven (in this thread no less) that a good
optimizer can elide the allocation. If the compiler can reasonably *make*
it zero overhead, then it *is* zero overhead.
1. It is impossible (practically) in general case.
vector<coroutine> x(N);
In case of coroutines with concrete types and sizeof known at compile -
this can be done within single allocation.
But if coroutine type is erased the we will have N+1 allocations in
general case - it can't be practically elided.
Ignoring the rest of the discussion on this point, I never claimed that
P0057 could guarantee elision in the case you present here. Before, you
asked about a *specific* problem, and I answered with a specific example
showing that it was elidable. What you've shown here hardly disproves my
point.

Also... how does `vector<coroutine>` make any kind of sense with regard to
P0114? The type isn't type erased, so each coroutine has its own type.
Therefore, in order to put them in a homogeneous container like `vector`,
you'll have to type-erase them. Which requires memory allocation.

At which point, your version gains *nothing* over P0057.
Post by Evgeny Panasyuk
2. Even if consider only functions scopes - escape analysis would not give
100% guarantee for elision in every case. First of all - I think it would
hit halting problem, second - some of functions in call tree may not be
inlined for adequate reasons - and this would blind analysis.
P0114 *requires* that all resumable functions you call are inlined. If
they're not inlined, you have to manually box them (and the boxing function
is no longer resumable). Boxing involves type erasure. And as previously
stated, memory allocation.

In order for P0114 to not require the same allocations as P0057, you must
be using resumable functions directly, without boxing. So they must be
inline. And therefore, your second problem is a non-issue for comparable
cases: the compiler for the TU has access to all relevant code.

The only question that remains is this: given full inlining, where does the
optimizer break down?

Do you have any actual knowledge that it breaks down in common cases? Or
can a smart one cover 80-90% of these cases? Stop talking theory as though
this weren't an idea that has already been implemented on at least one
compiler.
Post by Evgeny Panasyuk
3. This would put additional burden on implementers, and I don't see
reasonable benefits which we get for such burden.
No, it doesn't. Or rather, it's the same burden, it's just in a different
place.

P0114 requires implementations to go through whole hierarchies of inline
function calls and generate types that represent their stacks. It puts a
lot of burden on the implementer too; it's just in the implementation of
the feature rather than the *optimization* phase.

It's more or less the same work either way. Though admittedly, P0114 does
make it a bit easier for the compiler to see it.

Post by Evgeny Panasyuk
4. C++11 has lambdas with concrete type - this ensures zero overhead, and
fits naturally into language. We don't have type-erasured closures. We can
use external type erasure like std::function when needed.
Why we should have type-erasure for stackless coroutines?
Because implementing await machinery (promises, awaitable, etc) is hard
enough as it is. Adding a template on top of everything only makes things
harder.
--
Evgeny Panasyuk
2015-10-14 03:17:43 UTC
Permalink
Post by Nicol Bolas
Except that he's already proven (in this thread no less) that a
good optimizer can elide the allocation. If the compiler can
reasonably /make/ it zero overhead, then it /is/ zero overhead.
1. It is impossible (practically) in general case.
vector<coroutine> x(N);
In case of coroutines with concrete types and sizeof known at
compile - this can be done within single allocation.
But if coroutine type is erased the we will have N+1 allocations in
general case - it can't be practically elided.
Ignoring the rest of the discussion on this point, I never claimed that
P0057 could guarantee elision in the case you present here. Before, you
asked about a /specific/ problem, and I answered with a specific example
showing that it was elidable. What you've shown here hardly disproves my
point.
It is not zero overhead even with a good optimizer/compiler, because they
can't elide every allocation, and I am not talking about exotic cases.
Post by Nicol Bolas
Also... how does `vector<coroutine>` make any kind of sense with regard
to P0114? The type isn't type erased, so each coroutine has its own
type. Therefore, in order to put them in a homogeneous container like
`vector`, you'll have to type-erase them. Which requires memory allocation.
At which point, your version gains /nothing/ over P0057.
Identical coroutines have the same concrete type. For instance, with P0114
it may be:

struct concrete_coroutine
{
    resumable auto r = expression;
    // ...
};
...
make_unique<concrete_coroutine[]>(N);

For example, imagine some kind of TCP server: the coroutine for each
incoming connection does the same job and has the same locals, and as a
consequence they all have the same type.

LIVE DEMO using macro-based stackless coroutines from Boost.Asio:
http://coliru.stacked-crooked.com/a/0c09744abd5e57ae
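
For readers who cannot open the demo, the general switch-based technique that macro libraries build on can be sketched by hand as below. This is not Boost.Asio's actual macro expansion; Countdown and drain are hypothetical names, and the resume point and the loop variable are stored as plain members so that the coroutine is an ordinary concrete type.

```cpp
#include <cassert>
#include <vector>

// Hand-written stackless coroutine that yields 3, 2, 1 and then finishes.
struct Countdown {
    int resume_point = 0;  // which case label to jump to on the next resume()
    int i = 0;             // a "local" promoted to a member so it survives suspension

    // Runs until the next yield. Returns true and sets `out` if a value was
    // yielded, false once the coroutine has finished.
    bool resume(int& out) {
        switch (resume_point) {
        case 0:
            for (i = 3; i > 0; --i) {
                out = i;
                resume_point = 1;
                return true;   // "yield i"
        case 1:;               // execution continues here on the next resume()
            }
        }
        resume_point = 2;      // finished: no case label matches from now on
        return false;
    }
};

// Resumes the coroutine until it finishes and collects everything it yields.
std::vector<int> drain(Countdown& c) {
    std::vector<int> values;
    int v = 0;
    while (c.resume(v)) values.push_back(v);
    assert(!c.resume(v));      // a finished coroutine stays finished
    return values;
}
```

Every Countdown has the same type and a size known at compile time, so a server could keep one per connection inside a single array or inside a larger per-connection struct.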
Post by Nicol Bolas
2. Even if consider only functions scopes - escape analysis would
not give 100% guarantee for elision in every case. First of all - I
think it would hit halting problem, second - some of functions in
call tree may not be inlined for adequate reasons - and this would
blind analysis.
P0114 /requires/ that all resumable functions you call are inlined. If
they're not inlined, you have to manually box them (and the boxing
function is no longer resumable). Boxing involves type erasure. And as
previously stated, memory allocation.
In order for P0114 to not require the same allocations as P0057, you
must be using resumable functions directly, without boxing. So they must
be inline. And therefore, your second problem is a non-issue for
comparable cases: the compiler for the TU has access to all relevant
code.


It requires inlining of resumable things that are inside the body of
resumable functions.
Outside of a resumable context you can .resume() a coroutine without
inlining.
Post by Nicol Bolas
The only question that remains is this: given full inlining, where does
the optimizer break down?
For instance, when you store a coroutine in some container, which is not
an unusual case.
Post by Nicol Bolas
Stop talking theory as
though this weren't an idea that has already been implemented on at
least one compiler.
I am not talking specifically about P0114. I am talking about stackless
coroutines with concrete non-type-erased types.

And this is already implementable in macro-library form. I have already
shown several examples in the current topic, even await for the List Monad.
And these examples are live: you can test them, modify them, and recompile
them in your browser with just a few mouse clicks. It is not just a
theory; it is an approach proven to work.
Post by Nicol Bolas
3. This would put additional burden on implementers, and I don't see
reasonable benefits which we get for such burden.
No, it doesn't. Or rather, it's the same burden, it's just in a
different place.
It is not the same burden. Stackless coroutines with concrete types,
P0114, as well as macro-based solutions do not need special tricky
allocation elision; they just don't do any allocation in the first place.
Post by Nicol Bolas
P0114 requires implementations to go through whole hierarchies of inline
function calls and generate types that represent their stacks. It puts a
lot of burden on implementer too; it's just in the implementation of the
feature rather than the /optimization/ phase.
It's more or less the same work either way. Though admittedly, the P0114
does make it a bit easier for the compiler to see it.
Again, I am not talking specifically about P0114. Even P0057 can be
changed to have a concrete coroutine type.
Post by Nicol Bolas
4. C++11 has lambdas with concrete types - this ensures zero
overhead, and fits naturally into the language. We don't have
type-erased closures. We can use external type erasure like
std::function when needed.
Why should we have type erasure for stackless coroutines?
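
The lambda comparison above can be made concrete. The following is a minimal sketch, not taken from either proposal; `make_adder` and `make_erased_adder` are hypothetical names. It shows a closure returned with its concrete type, and the same closure wrapped in std::function only where a uniform type is wanted.

```cpp
#include <cassert>
#include <functional>
#include <type_traits>

// A closure has its own concrete, compiler-generated type. Returning it
// through `auto` keeps that type, so no type erasure is involved.
inline auto make_adder(int k) {
    return [k](int x) { return x + k; };
}

// The same closure behind std::function: callers see one uniform type,
// at the cost of an indirect call (and possibly a heap allocation when
// the captured state is large).
inline std::function<int(int)> make_erased_adder(int k) {
    return make_adder(k);
}

using Adder = decltype(make_adder(0));

// The concrete closure type and its erased wrapper are different types.
static_assert(!std::is_same_v<Adder, std::function<int(int)>>,
              "erasure is an opt-in layer on top of the concrete type");
```

Both `make_adder(2)(40)` and `make_erased_adder(2)(40)` evaluate to 42; the difference is only in what the caller is able to name and store.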
Because implementing await machinery (promises, awaitable, etc) is hard
enough as it is.
It is implementable to some extent even with macros, though the syntax is
not pretty.
Post by Nicol Bolas
Adding a template on top of everything only makes
things harder.
Which template? Do you mean mandatory inlining in P0114? It requires
this inlining in order to solve an orthogonal and harder problem, not the
problem of allocations.
N4244, something of a predecessor of P0114, also does not force type erasure
and allocation, but it does not require the inlining you are referring to.
Gor Nishanov
2015-10-14 03:50:15 UTC
Permalink
Post by Evgeny Panasyuk
I am not talking specifically about P0114. I am talking about stackless
coroutines with concrete non-type-erased types.
The starting point of my design was a lambda* with the properties you
describe. When applied to problems I needed solving I found it
unsatisfactory and therefore went with N4134 proposal. That does not mean
that at some point, somebody won't be able to invent a better lambda* and
get it standardized.

P0114, P0057 and lambda* are all powered by the same underlying machinery.
A transformation of a state machine written in imperative fashion into an
actual state machine. The difference is in a public face of the state
machine. You just need to figure out compelling use-cases and sane
semantics that are not already covered efficiently by P0057 and write a
proposal. I cannot do it for you as the problems I need solving, namely
async I/O and async programming in general, are addressed by P0057
succinctly and efficiently.
Evgeny Panasyuk
2015-10-14 04:14:59 UTC
Permalink
Post by Gor Nishanov
Post by Evgeny Panasyuk
I am not talking specifically about P0114. I am talking about stackless
coroutines with concrete non-type-erased types.
The starting point of my design was a lambda* with the properties you
describe. When applied to problems I needed solving I found it
unsatisfactory and therefore went with N4134 proposal. That does not mean
that at some point, somebody won't be able to invent a better lambda* and
get it standardized.
Well, this is also the problem. If we already had some stackless
coroutines in ISO, it would be much harder to get an additional one into it.
Post by Gor Nishanov
P0114, P0057 and lambda* are all powered by the same underlying machinery.
A transformation of a state machine written in imperative fashion into an
actual state machine. The difference is in a public face of the state
machine. You just need to figure out compelling use-cases and sane
semantics that are not already covered efficiently by P0057 and write a
proposal. I cannot do it for you as the problems I need solving, namely
async I/O and async programming in general, are addressed by P0057
succinctly and efficiently.
If you need only async I/O - yes, I could imagine that an extra allocation is
tolerable in such a context. But P0057 describes not only async I/O but
also, for instance, generators. And for generators (like transform iterators)
an extra allocation is a huge price.
It is an acceptable price for languages like C# (especially taking into
account the fast happy-path allocations in the first generation of a copying
GC), but it is definitely not acceptable for C++, which has costly default
allocations and an ambitious zero-overhead goal.

And if you consider only the performance of async I/O use cases, then I think
the proposal should directly reflect this somehow.
Nicol Bolas
2015-10-14 04:51:07 UTC
Permalink
Post by Evgeny Panasyuk
If you need only async I/O - yes, I could imagine that extra allocation is
tolerable in such context. But P0057 describes not only async I/O - but
also for instance generators. And for generators (like transform iterators)
an extra allocation is huge price.
A price you will never pay because it will be elided.

Please stop repeating statements that have been disproven; it's not helping
your case. You have yet to post an example of a generator that would not be
elided.
Evgeny Panasyuk
2015-10-14 05:13:44 UTC
Permalink
Post by Evgeny Panasyuk
If you need only async I/O - yes, I could imagine that extra
allocation is tolerable in such context. But P0057 describes not
only async I/O - but also for instance generators. And for
generators (like transform iterators) an extra allocation is huge price.
A price you will never pay because it will be elided.
Please stop repeating statements that have been disproven; it's not
helping your case. You have yet to post an example of a generator that
would not be elided.
I already described it several times: just put the generator into some
structure/array or return it somewhere.
In a similar situation:
http://coliru.stacked-crooked.com/a/0c09744abd5e57ae
- the allocation of a P0057 generator will not be elided; there will be N
allocations, i.e. one for each coroutine.

You can read what Gor said previously in this topic:


"If coroutine lifetime is fully enclosed in the lifetime of the calling
function, then we can
1) elide allocation and use a temporary on the stack of the caller
2) replace indirect calls with direct calls and inline as appropriate:"


There is an "if" condition. Even if we assume that optimizers always do
elision when the condition is true, there is still no elision for cases where
the condition is false.
Nicol Bolas
2015-10-14 13:47:09 UTC
Permalink
Post by Evgeny Panasyuk
Post by Evgeny Panasyuk
If you need only async I/O - yes, I could imagine that extra
allocation is tolerable in such context. But P0057 describes not
only async I/O - but also for instance generators. And for
generators (like transform iterators) an extra allocation is huge
price.
Post by Evgeny Panasyuk
A price you will never pay because it will be elided.
Please stop repeating statements that have been disproven; it's not
helping your case. You have yet to post an example of a generator that
would not be elided.
I already described it several times, just put generator into some
structure/array or return somewhere.
http://coliru.stacked-crooked.com/a/0c09744abd5e57ae
- allocation of P0057 generator will not be elided, there will be N
allocations, i.e. for each coroutine.
And in your case, the `Coroutine` will have to type-erase them too. Thus
performing N allocations.

Put it another way. In order to make something a member of a struct, you
must first be able to name it. In C++ as it currently stands, it is
*impossible* to store an unnamable type in a non-static data member.
Whether it's a lambda or the result of a resumable expression or anything
else, it simply *cannot happen*.

Templates will not save you, because you can't do this:

Coroutine<resumable {expr}>.

You can't even do this:

using coroutine_type = decltype(resumable {expr});

Why do these fail? Because each separate `expr`, even if it's technically
the exact same function, will result in a different type. Just like a
lambda, copying-and-pasting the expression will yield a different type.

What you're suggesting is impossible. Or at least, it's impossible without
trickery (ie: macros).
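
The claim that every lambda expression produces its own type can be checked directly. This is a minimal sketch in standard C++17; `distinct_closure_types` is a hypothetical name used only for illustration.

```cpp
#include <cassert>
#include <type_traits>

// Two lambda expressions with identical tokens still have two distinct
// closure types, so repeating the expression never reproduces the type.
inline bool distinct_closure_types() {
    auto first  = [](int x) { return x * 2; };
    auto second = [](int x) { return x * 2; };
    (void)first;   // only the types are inspected below
    (void)second;
    return !std::is_same_v<decltype(first), decltype(second)>;
}
```

`distinct_closure_types()` returns true on a conforming compiler, which is why a type spelled by repeating the expression can never match an existing object.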

So Boost.Asio is either doing type erasure or it is cheating. Any core
feature will not be allowed to cheat, so you'll *have* to use type erasure
to store the result of such an operation.
Post by Evgeny Panasyuk
"If coroutine lifetime is fully enclosed in the lifetime of the calling
function, then we can
1) elide allocation and use a temporary on the stack of the caller
2) replace indirect calls with direct calls and inline as appropriate:"
There is "if" condition. Even if we assume that optimizers always do
elision when condition is true, there is still no elision for cases with
false condition.
The only way you could avoid a dynamic allocation while still leaving the
lifetime of the calling function is if you could copy/move the coroutine
type. And that's just not reasonable.

Or rather, whether a coroutine is movable is based *entirely* on its
implementation. An implementation that you may not have access to. So how
could the compiler possibly know that function X will return an immobile
type while function Y returns a mobile one?

Of course, for P0114, the question is moot: all resumable functions must be
inline, and thus any code catching them will know whether they're mobile or
not. But for any other suggestion, the question still stands: if the
function isn't inline, how do you know if the coroutine type is mobile?

This is a question that has yet to yield a satisfactory answer. P0114 says
that you have to box non-inline functions, which means you *always* pay
overhead for them, unlike P0057, which allows the possibility of optimizing
the overhead based on usage.

Without a plan to deal with coroutine type mobility for non-inline cases,
there's no reason to talk about what happens if a non-erased coroutine type
escapes its owning function.

P0057 has a plan for this. You don't thus far.
Giovanni Piero Deretta
2015-10-14 15:16:34 UTC
Permalink
Post by Nicol Bolas
Post by Evgeny Panasyuk
Post by Evgeny Panasyuk
If you need only async I/O - yes, I could imagine that extra
allocation is tolerable in such context. But P0057 describes not
only async I/O - but also for instance generators. And for
generators (like transform iterators) an extra allocation is huge
price.
Post by Evgeny Panasyuk
A price you will never pay because it will be elided.
Please stop repeating statements that have been disproven; it's not
helping your case. You have yet to post an example of a generator that
would not be elided.
I already described it several times, just put generator into some
structure/array or return somewhere.
http://coliru.stacked-crooked.com/a/0c09744abd5e57ae
- allocation of P0057 generator will not be elided, there will be N
allocations, i.e. for each coroutine.
And in your case, the `Coroutine` will have to type-erase them too. Thus
performing N allocations.
Put it another way. In order to make something a member of a struct, you
must first be able to name it. In C++ as it currently stands, it is
*impossible* to store an unnamable type in a non-static data member.
Whether it's a lambda or the result of a resumable expression or anything
else, it simply *cannot happen*.
It is trivially possible. The name is unimportant, only the type is. In
this example a lambda stands for an unnamed type. I could have used other
unnamed types.

template<class Impl> struct Coroutine { Impl body; };

auto createACoroutine()
{
    auto coro = [] {...};
    Coroutine<decltype(coro)> coroWrapper = {coro};
    return coroWrapper;
}
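
For readers who want to compile the idea, here is a minimal sketch of the same wrapper with the elided lambda body replaced by a stand-in. The counter body is purely an assumption for illustration; it is not a coroutine and is not what either proposal would generate.

```cpp
#include <cassert>

template <class Impl>
struct Coroutine {
    Impl body;   // stored by value: no heap allocation, no type erasure
};

// Hypothetical stand-in for a coroutine body: each call "resumes" it and
// returns the next integer, starting at 0.
inline auto createACoroutine() {
    auto coro = [n = 0]() mutable { return n++; };
    Coroutine<decltype(coro)> coroWrapper = {coro};
    return coroWrapper;
}
```

Calling `createACoroutine().body()` twice on the same object yields 0 and then 1, and the wrapper's type is available to callers as `decltype(createACoroutine())`.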


-- gpd
Nicol Bolas
2015-10-14 15:31:39 UTC
Permalink
On Wednesday, October 14, 2015 at 11:16:35 AM UTC-4, Giovanni Piero Deretta
Post by Giovanni Piero Deretta
Post by Nicol Bolas
Post by Evgeny Panasyuk
Post by Evgeny Panasyuk
If you need only async I/O - yes, I could imagine that extra
allocation is tolerable in such context. But P0057 describes not
only async I/O - but also for instance generators. And for
generators (like transform iterators) an extra allocation is huge
price.
Post by Evgeny Panasyuk
A price you will never pay because it will be elided.
Please stop repeating statements that have been disproven; it's not
helping your case. You have yet to post an example of a generator that
would not be elided.
I already described it several times, just put generator into some
structure/array or return somewhere.
http://coliru.stacked-crooked.com/a/0c09744abd5e57ae
- allocation of P0057 generator will not be elided, there will be N
allocations, i.e. for each coroutine.
And in your case, the `Coroutine` will have to type-erase them too. Thus
performing N allocations.
Put it another way. In order to make something a member of a struct, you
must first be able to name it. In C++ as it currently stands, it is
*impossible* to store an unnamable type in a non-static data member.
Whether it's a lambda or the result of a resumable expression or anything
else, it simply *cannot happen*.
It is trivially possible. The name is unimportant, only the type is. In
this example a lambda stands for an unnamed type. I could have used other
unnamed types.
template<class Impl> struct Coroutine { Impl body; };
auto createACoroutine()
{
auto coro = [] {...};
Coroutine<decltype(coro)> coroWrapper = {coro};
return coroWrapper;
}
OK, yes you can do that. My mistake.

However, that code is not a complete example. It doesn't match with your
sample code (which currently uses type erasure/chicanery). So what does the
non-trick version look like?

const int numCoroutines = 10000;
std::vector<What> v;
v.reserve(numCoroutines);
for(int i = 0; i < numCoroutines; ++i)
    v.emplace_back(createACoroutine());

What goes in `What`?

Oh sure, you can do it if you invert it and wrap it in a function call:

auto createCoroutineVector(const int numCoroutines)
{
    std::vector<decltype(createACoroutine())> v;
    v.reserve(numCoroutines);
    for(int i = 0; i < numCoroutines; ++i)
        v.emplace_back(createACoroutine());
    return v;
}

But this now forces every piece of code to use template deduction to
interact with this data. This makes the code a lot less readable.

Just look at this example for inscrutability. To be able to know what to do
with the return value, I have to look through two function definitions. The
more layers between the source type and the destination, the less readable
the code gets.

I would much rather have a genuine type and type-erasure than to have to
search through 5 function calls just to figure out what a type is supposed
to be.
Giovanni Piero Deretta
2015-10-14 15:42:41 UTC
Permalink
Post by Nicol Bolas
On Wednesday, October 14, 2015 at 11:16:35 AM UTC-4, Giovanni Piero
Post by Giovanni Piero Deretta
Post by Nicol Bolas
Post by Evgeny Panasyuk
Post by Evgeny Panasyuk
If you need only async I/O - yes, I could imagine that extra
allocation is tolerable in such context. But P0057 describes not
only async I/O - but also for instance generators. And for
generators (like transform iterators) an extra allocation is huge
price.
Post by Evgeny Panasyuk
A price you will never pay because it will be elided.
Please stop repeating statements that have been disproven; it's not
helping your case. You have yet to post an example of a generator
that
Post by Evgeny Panasyuk
would not be elided.
I already described it several times, just put generator into some
structure/array or return somewhere.
http://coliru.stacked-crooked.com/a/0c09744abd5e57ae
- allocation of P0057 generator will not be elided, there will be N
allocations, i.e. for each coroutine.
And in your case, the `Coroutine` will have to type-erase them too. Thus
performing N allocations.
Put it another way. In order to make something a member of a struct, you
must first be able to name it. In C++ as it currently stands, it is
*impossible* to store an unnamable type in a non-static data member.
Whether it's a lambda or the result of a resumable expression or anything
else, it simply *cannot happen*.
It is trivially possible. The name is unimportant, only the type is. In
this example a lambda stands for an unnamed type. I could have used other
unnamed types.
[...]
OK, yes you can do that. My mistake.
However, that code is not a complete example. It doesn't match with your
sample code (which currently uses type erasure/chicanery). So what does the
non-trick version look like?
Note that I'm not the original poster.

You write the same code directly inside createACoroutineVector. No need for
the extra function call, createACoroutine was purely an example.
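
A minimal sketch of that suggestion, with everything written inside one function. The closure body is a hypothetical stand-in for a coroutine, and `makeOne` is a name invented for this example.

```cpp
#include <cassert>
#include <cstddef>
#include <vector>

inline auto createCoroutineVector(std::size_t numCoroutines) {
    // Hypothetical coroutine stand-in: a counter starting at `start`.
    auto makeOne = [](int start) {
        return [n = start]() mutable { return n++; };
    };

    // The element type is named through decltype; no erasure is needed.
    std::vector<decltype(makeOne(0))> v;
    v.reserve(numCoroutines);            // one allocation for all elements
    for (std::size_t i = 0; i < numCoroutines; ++i)
        v.push_back(makeOne(static_cast<int>(i)));
    return v;
}
```

The caller still has to receive the result through `auto`, which is the readability cost raised earlier in the thread; the sketch only shows that the storage itself needs a single allocation.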

-- gpd
Evgeny Panasyuk
2015-10-15 00:21:50 UTC
Permalink
Post by Evgeny Panasyuk
I already described it several times, just put generator into some
structure/array or return somewhere.
http://coliru.stacked-crooked.com/a/0c09744abd5e57ae
- allocation of P0057 generator will not be elided, there will be N
allocations, i.e. for each coroutine.
And in your case, the `Coroutine` will have to type-erase them too. Thus
performing N allocations.
No need for type-erasure - no need for N allocations.
Post by Evgeny Panasyuk
Put it another way. In order to make something a member of a struct, you
must first be able to name it. In C++ as it currently stands, it is
/impossible/ to store an unnamable type in a non-static data member.
Whether it's a lambda or the result of a resumable expression or
anything else, it simply /cannot happen/.
Again, I am not talking specifically about P0114r0. I am talking about
at least adding the possibility of having concrete types in a P0057R0-like
proposal.

For instance, here:

generator<int> numbers()
{
    yield 1;
}

we can use "numbers" as the name of a synthesized class (which represents
the concrete coroutine), instead of the name of a synthesized function.
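
As a rough illustration of what such a synthesized class could look like, here is a hand-written state machine. This is only a sketch under the assumption that `numbers` names a class; the `next()` interface and the `Widget` struct are hypothetical and come from neither P0057 nor P0114.

```cpp
#include <cassert>
#include <optional>

// Hand-written stand-in for a class a compiler might synthesize from
//     generator<int> numbers() { yield 1; }
class numbers {
    enum class State { Start, Done };
    State state_ = State::Start;

public:
    // Resume: returns the next yielded value, or nullopt once finished.
    std::optional<int> next() {
        switch (state_) {
        case State::Start:
            state_ = State::Done;
            return 1;               // corresponds to `yield 1;`
        case State::Done:
            break;
        }
        return std::nullopt;
    }
};

// The type has a name and a known size, so it can be a plain data member.
struct Widget {
    numbers gen;
};
```

With this shape, `Widget w; w.gen.next()` yields 1 and a second call yields an empty optional, all without a separate allocation for the coroutine state.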
Post by Evgeny Panasyuk
What you're suggesting is impossible. Or at least, it's impossible
without trickery (ie: macros).
It is possible without any trickery. It is just a transformation of
function-like code into a class which has the name of that function-like
entity.
Post by Evgeny Panasyuk
So Boost.Asio is either doing type erasure or it is cheating. Any core
feature will not be allowed to cheat, so you'll /have/ to use type
erasure to store the result of such an operation.
There is no cheating and no type erasure. We can just use the name given by
the user.
Post by Evgeny Panasyuk
The only way you could avoid a dynamic allocation while still leaving
the lifetime of the calling function is if you could copy/move the
coroutine type. And that's just not reasonable.
First of all, I would like to have copy and move semantics, or at least
some explicit control over them.

Anyway, even if we did not have copy and move semantics, yes - there
will be some allocation when the coroutine needs to outlive the calling
function, but there will be fewer allocations than in the proposed P0057R0.

For instance make_unique<concrete_coroutine[]>(N) is just one
allocation, instead of N+1.

Another example is

struct Widget { concrete_coroutine x; };
make_unique<Widget>()

This is also one allocation, while P0057R0 would result in two
allocations - one for Widget itself and another for coroutine.
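
The allocation counts can be checked with a small model. This is a sketch only: `ConcreteCoroutine`, `ErasedCoroutine`, `Frame` and `CountedAlloc` are hypothetical stand-ins, and the erased version models the case where the frame allocation is not elided.

```cpp
#include <cassert>
#include <cstddef>
#include <memory>
#include <new>

// Counts every heap allocation made for types derived from it.
struct CountedAlloc {
    static inline int allocations = 0;
    static void* operator new(std::size_t n)   { ++allocations; return ::operator new(n); }
    static void* operator new[](std::size_t n) { ++allocations; return ::operator new[](n); }
    static void  operator delete(void* p) noexcept   { ::operator delete(p); }
    static void  operator delete[](void* p) noexcept { ::operator delete[](p); }
};

// Hypothetical concrete coroutine: its state lives inline in the object.
struct ConcreteCoroutine : CountedAlloc {
    int state = 0;
    int resume() { return state++; }
};

// Hypothetical type-erased coroutine: a handle that owns a separately
// allocated frame.
struct Frame : CountedAlloc {
    int state = 0;
};
struct ErasedCoroutine : CountedAlloc {
    std::unique_ptr<Frame> frame = std::make_unique<Frame>();
    int resume() { return frame->state++; }
};

// Counted allocations needed to build an array of n coroutines of type T.
template <class T>
int allocationsForArray(std::size_t n) {
    const int before = CountedAlloc::allocations;
    auto arr = std::make_unique<T[]>(n);
    return CountedAlloc::allocations - before;
}
```

Under these assumptions `allocationsForArray<ConcreteCoroutine>(8)` is 1 and `allocationsForArray<ErasedCoroutine>(8)` is 9, matching the one versus N+1 figures above.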
g***@hubblehome.com
2015-10-16 03:41:54 UTC
Permalink
Post by Evgeny Panasyuk
First of all, I would like to have copy and move semantics, at least
some explicit control for it.
+1
Nicol Bolas
2015-10-16 15:52:28 UTC
Permalink
Post by Evgeny Panasyuk
Post by Evgeny Panasyuk
I already described it several times, just put generator into some
structure/array or return somewhere.
http://coliru.stacked-crooked.com/a/0c09744abd5e57ae
- allocation of P0057 generator will not be elided, there will be N
allocations, i.e. for each coroutine.
And in your case, the `Coroutine` will have to type-erase them too. Thus
performing N allocations.
No need for type-erasure - no need for N allocations.
Post by Evgeny Panasyuk
Put it another way. In order to make something a member of a struct, you
must first be able to name it. In C++ as it currently stands, it is
/impossible/ to store an unnamable type in a non-static data member.
Whether it's a lambda or the result of a resumable expression or
anything else, it simply /cannot happen/.
Again, I am not talking specifically about P0114r0. I am talking about
at least adding possibility to have concrete types in P0057R0-like proposal.
Well, it's hard to gauge how reasonable a proposal is when said proposal *doesn't
actually exist*. You don't have a proposal; you just have some general
notions of how you think it ought to act, with no demonstrated knowledge of
how feasible that will be to implement.

And no, Boost.Asio's macro hacks are not a feasibility study.
Post by Evgeny Panasyuk
generator<int> numbers()
{
yield 1;
}
we can use "numbers" as a name for synthesized class (which represents
concrete coroutine), instead of name for synthesized function.
No, you cannot. Why? Well:

generator<int> numbers();

static_assert(std::is_function_v<decltype(numbers)>);

This assert should never fire. Yet you want to *make* it fire.

That's breaking basic rules of C++: a function declaration should be a
function declaration, not a struct declaration. Even lambdas don't look
like non-lambda functions.

So let's skip past that obviously non-functional idea. Let's say that you
allow users to decorate a function definition. Maybe you even use lambda
syntax, since it is similar:

[]numbers() -> generator<int>;

OK, so the compiler sees this and knows that `numbers` is a struct.

How big is it?

The compiler doesn't know. The compiler *cannot know*. Not from the
information presented here. What `numbers` is here is an incomplete type.

The only way to generate a complete type is to complete the function
definition of `numbers`. That way, the alignment and storage of the stack
data is available.

And that means that the function must be inline. Not only that, you can't
have virtual functions use this at all.

So all you've done is *re-invent P0114* with slightly different syntax. For
someone who keeps claiming that their idea isn't P0114, it seems to have a
lot of P0114's *restrictions*.

The beauty of P0057's design is that it works with C++ as it currently
exists. It changes the bare minimum needed to make the feature work. It
doesn't require that resumable functions are inlined or anything like that.
It doesn't make function declarations automatically become struct
declarations.

All of the work for resumable functions happens *within* the function that
is resumable, and external code is none the wiser. I can manipulate a
resumable function as normal, I can stick one in a std::function, I can
make it virtual, non-inline, anything. It's just a normal function.

What, did you think P0057 made the decision to type-erase coroutines on a
whim? It is there *specifically* to avoid all of these elements. That
design decision is what allows P0057 to work *generally*. No forced
inlining. Virtual calls are allowed. Resumable functions look and behave
just like any other functions.
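
The generality argument can be sketched in ordinary C++. The following is not P0057 machinery; it is a hand-written model in which `IntGenerator`, `Source` and `OneToThree` are hypothetical names. It shows why an erased return type lets the producing function be virtual or defined out of line: the declaration does not depend on the body.

```cpp
#include <cassert>
#include <memory>
#include <optional>

// Type-erased generator: callers see only this interface.
struct IntGenerator {
    virtual ~IntGenerator() = default;
    virtual std::optional<int> next() = 0;
};

struct Source {
    virtual ~Source() = default;
    // The return type is independent of any implementation's state, so
    // this can be virtual and implemented in another translation unit.
    virtual std::unique_ptr<IntGenerator> numbers() const = 0;
};

struct OneToThree final : Source {
    std::unique_ptr<IntGenerator> numbers() const override {
        // The state layout is private to this body; its size never
        // appears in the interface.
        struct Impl final : IntGenerator {
            int n = 1;
            std::optional<int> next() override {
                if (n > 3) return std::nullopt;
                return n++;
            }
        };
        return std::make_unique<Impl>();   // the price: one heap allocation
    }
};
```

The trade-off discussed in the thread is visible here: the interface is fully general, and each call to `numbers()` pays for one allocation unless an optimizer can remove it.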
Evgeny Panasyuk
2015-10-17 11:14:39 UTC
Permalink
Post by Evgeny Panasyuk
Again, I am not talking specifically about P0114r0. I am talking about
Post by Evgeny Panasyuk
at least adding possibility to have concrete types in P0057R0-like proposal.
Well, it's hard to gauge how reasonable a proposal is when said proposal *doesn't
actually exist*. You don't have a proposal; you just have some general
notions of how you think it ought to act, with no demonstrated knowledge of
how feasible that will be to implement.
Having just a feature proposal and an implementation is not enough
reason to standardize. A proposed feature must also fit the main design
goals of the language.
I am showing a major flaw in the existing proposal. Do you think it should
not be discussed?
I agree that the best way is to write a full-featured proposal, but first I
want to discuss it here.
Post by Evgeny Panasyuk
And no, Boost.Asio's macro hacks are not a feasibility study.
These hacks allow us to investigate possible paths with little effort.
Post by Evgeny Panasyuk
That's breaking basic rules of C++: a function declaration should be a
function declaration, not a struct declaration. Even lambdas don't look
like non-lambda functions.
I agree: if it is not a function, it should not pretend to be a
function.
Post by Evgeny Panasyuk
OK, so the compiler sees this and knows that `numbers` is a struct.
How big is it?
The compiler doesn't know. The compiler *cannot know*. Not from the
information presented here. What `numbers` is here is an incomplete type.
The only way to generate a complete type is to complete the function
definition of `numbers`. That way, the alignment and storage of the stack
data is available.
And that means that the function must be inline.
Yes, if someone wants a coroutine with a concrete type, its body must be
visible to the compiler at the point of use. If type erasure is OK, then the
body can be hidden completely. And this is orthogonal to syntax issues.
Post by Evgeny Panasyuk
So all you've done is *re-invent P0114* with slightly different syntax.
For someone who keeps claiming that their idea isn't P0114, it seems to
have a lot of P0114's *restrictions*.
P0114 sets a much more ambitious goal: it tries to merge/fuse several stack
frames into one coroutine.
Post by Evgeny Panasyuk
The beauty of P0057's design is that it works with C++ as it currently
exists. It changes the bare minimum needed to make the feature work. It
doesn't require that resumable functions are inlined or anything like that.
It doesn't make function declarations automatically become struct
declarations.
P0057 can be changed to allow concrete coroutine types with very small
syntax modifications. For instance:

// Type erasure version:
generator<int> numbers(int x)
{
    yield x;
    ...
}
// Compiler uses std::coroutine_traits<generator<int>, int> (as it is
// proposed in P0057)
/**************************/

// Version with concrete coroutine type:
auto numbers(int x, concrete_generator_tag = 0)
{
    yield x;
    ...
}
// Compiler uses coroutine_traits<auto_result_tag, int,
// concrete_generator_tag>, and based on this specialization it will
// generate the return type and value.
// Type of coroutine is decltype(numbers(1)), i.e.:
decltype(numbers(1)) coro = numbers(1);

This approach is even closer to P0057.
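
Since the syntax above is hypothetical, it cannot be compiled today. The following hand-written sketch only imitates the intended result in standard C++17: `numbers_coroutine` stands in for the compiler-generated class, the tag is modeled as an empty struct (so its default is `{}` rather than `0`), and only the single `yield x;` is modeled, not the elided remainder of the body.

```cpp
#include <cassert>
#include <optional>

// Empty tag type; the name is taken from the example above.
struct concrete_generator_tag {};

// Stand-in for the state machine a compiler could generate for a body
// consisting of `yield x;`.
class numbers_coroutine {
    int x_;
    bool done_ = false;

public:
    explicit numbers_coroutine(int x) : x_(x) {}

    // Resume: returns the yielded value once, then nullopt.
    std::optional<int> next() {
        if (done_) return std::nullopt;
        done_ = true;
        return x_;
    }
};

// The factory keeps function-call syntax; the concrete type is reachable
// through decltype(numbers(1)).
inline auto numbers(int x, concrete_generator_tag = {}) {
    return numbers_coroutine{x};
}
```

Used as `decltype(numbers(1)) coro = numbers(1);`, the object lives wherever the caller puts it, and `coro.next()` yields 1 followed by an empty optional.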
Post by Evgeny Panasyuk
What, did you think P0057 made the decision to type-erase coroutines on a
whim? It is there *specifically* to avoid all of these elements. That
design decision is what allows P0057 to work *generally*. No forced
inlining. Virtual calls are allowed. Resumable functions look and behave
just like any other functions.
Type erasure can be optional; it should not be the only way.

One of the main use cases for coroutines is generators; even the authors of
P0057 refer to this use case frequently. Many languages which have some kind
of stackless coroutines started by supporting this use case, like Python/C#
yield.

While mandatory type erasure can be tolerated in cases like async I/O,
for generators it adds a huge overhead, which can't be tolerated in C++.
Ville Voutilainen
2015-10-14 05:36:26 UTC
Permalink
Post by Evgeny Panasyuk
Post by Gor Nishanov
The starting point of my design was a lambda* with the properties you
describe. When applied to problems I needed solving I found it
unsatisfactory and therefore went with N4134 proposal. That does not mean
that at some point, somebody won't be able to invent a better lambda* and
get it standardized.
Well, this is also the problem. If we already would have some stackless
coroutines in ISO - it would be much harder to get additional one into it.
That "much harder" is fairly questionable. While there apparently are people
who think working on e.g. stackful coroutines becomes a pointless exercise
if Gor's proposal is accepted, and while there are people who prefer
picking one rather than many solutions (albeit the problems being different),

- we have plenty of people in the committee who understand that the different
solutions are tackling different problems,
- for those who don't understand it, the proposal authors can fairly easily
re-explain it
- while they are doing that re-explaining, Gor is going to nod vigorously
and help them explain it, even though it's not the main focus of his
attention as far as the overall design space of these facilities goes,

so, compared to almost any other facility that would provide an
alternative/additional
approach for something already partially tackled by a standard facility,
standardizing something additional in this area may be much easier than people
fear.
Evgeny Panasyuk
2015-10-14 06:02:37 UTC
Permalink
Post by Ville Voutilainen
Post by Evgeny Panasyuk
Post by Gor Nishanov
The starting point of my design was a lambda* with the properties you
describe. When applied to problems I needed solving I found it
unsatisfactory and therefore went with N4134 proposal. That does not mean
that at some point, somebody won't be able to invent a better lambda* and
get it standardized.
Well, this is also the problem. If we already would have some stackless
coroutines in ISO - it would be much harder to get additional one into it.
That "much harder" is fairly questionable. While there apparently are people
who think working on e.g. stackful coroutines becomes a pointless exercise
if Gor's proposal is accepted, and while there are people who prefer
picking one rather than many solutions (albeit the problems being different),
Indeed, there is overlap between the use cases of stackless and stackful
coroutines. But in addition, each one has its own area where it beats
the other approach.

And I think C++ ISO needs both - stackless and stackful coroutines.
(though I would prefer to get stackless into ISO first, because stackful
can be completely implemented in library, like Boost.Context/Coroutine)

But here the situation is different - both options aim at exactly the same use
cases, and one is even strictly more powerful than the other. If we had a
stackless coroutine with a concrete type, we could always
implement type erasure on top of it, just like we have std::function
type erasure for concrete lambda types.
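A minimal sketch of the lambda analogy, using only standard C++14. The helper names (make_adder, make_scaler, apply_erased, apply_concrete) are made up for illustration. Each lambda keeps its own concrete type, and erasure through std::function is something the caller opts into:

```cpp
#include <cassert>
#include <functional>
#include <type_traits>

// Two lambdas with the same call signature still get distinct closure types.
inline auto make_adder(int k) { return [k](int x) { return x + k; }; }
inline auto make_scaler(int k) { return [k](int x) { return x * k; }; }

static_assert(!std::is_same<decltype(make_adder(1)), decltype(make_scaler(1))>::value,
              "distinct lambdas have distinct concrete types");

// Concrete use: the closure type is a template parameter, so no erasure
// wrapper is involved and the call target is known statically.
template <class F>
int apply_concrete(F f, int x) { return f(x); }

// Opt-in erasure: both closures convert to the same std::function type.
inline int apply_erased(const std::function<int(int)>& f, int x) { return f(x); }
```

Calling apply_concrete(make_adder(2), 3) works on the concrete closure, while apply_erased(make_scaler(2), 3) goes through the erased wrapper; the erased form is built on top of the concrete one, not the other way around.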
Ville Voutilainen
2015-10-14 07:42:51 UTC
Permalink
Post by Evgeny Panasyuk
Post by Ville Voutilainen
That "much harder" is fairly questionable. While there apparently are people
who think working on e.g. stackful coroutines becomes a pointless exercise
if Gor's proposal is accepted, and while there are people who prefer
picking one rather than many solutions (albeit the problems being different),
Indeed, there is overlap between the use cases of stackless and stackful
coroutines. But in addition, each one has its own area where it beats
the other approach.
Correct.
Post by Evgeny Panasyuk
And I think C++ ISO needs both - stackless and stackful coroutines. (though
I would prefer to get stackless into ISO first, because stackful can be
completely implemented in library, like Boost.Context/Coroutine)
Fully agreed.
Post by Evgeny Panasyuk
But here the situation is different - both options aim at exactly the same use
cases, and one is even strictly more powerful than the other. If we had a
stackless coroutine with a concrete type, we could always implement
type erasure on top of it, just like we have std::function type erasure for
concrete lambda types.
The use cases may be slightly different, because stackful coroutines
do not require an Awaitable type all through the call stack, whereas
stackless coroutines do. The use case may be the same, but which
facility to apply depends on other less-technical-things, like whether
the user of a coroutine controls the full call stack.

As far as having the concrete type goes, that sounds like it requires even
more inlining and across-call-stack transparency. In a stackless coroutine,
the erased type combined with elision of the erasure and allocations avoids
having all coroutines have a different type.
Evgeny Panasyuk
2015-10-14 21:09:05 UTC
Permalink
Post by Ville Voutilainen
As far as having the concrete type goes, that sounds like it requires even
more inlining and across-call-stack transparency.
It actually requires less inlining and transparency. For instance, this
code:
future<int> concrete_coroutine()
{
int local = await async_operation();
return local;
}
Can be straightforwardly transformed to something like:
struct concrete_coroutine
{
state_value_type current_state;
int local;

future<int> method_state_machine(); // or operator()()
};
Where method_state_machine can be compiled separately, in another
translation unit.
And this approach is actually already implementable with macros, to some
extent (it works, but a compiler-side transformation would give a better result).
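To make the struct-plus-method idea concrete, here is a hand-written sketch of such a state machine. It is only an illustration under simplifying assumptions: there is no future<int> or real await, the coroutine just suspends once between two steps, and every name (two_step_coroutine, resume, is_terminated, run_two_step) is hypothetical rather than taken from any proposal:

```cpp
#include <cassert>

// Hand-written stand-in for what a compiler might synthesize for a coroutine
// with one suspension point. Callers only need the struct layout.
struct two_step_coroutine {
    enum class state { start, suspended, done };
    state current_state = state::start;
    int local = 0;   // a "local variable" hoisted into the coroutine frame
    int result = 0;

    // Declared here, defined out of line; the definition could just as well
    // live in a different translation unit.
    bool resume(int input);
    bool is_terminated() const { return current_state == state::done; }
};

bool two_step_coroutine::resume(int input) {
    switch (current_state) {
    case state::start:
        local = input;                     // run up to the suspension point
        current_state = state::suspended;
        return true;                       // more work remains
    case state::suspended:
        result = local + input;            // continue after the suspension
        current_state = state::done;
        return false;
    case state::done:
        break;
    }
    return false;
}

// The frame is an ordinary object with automatic storage; nothing here
// touches the heap, whether or not resume() gets inlined.
inline int run_two_step() {
    two_step_coroutine coro;
    coro.resume(40);
    coro.resume(2);
    return coro.result;
}
```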
Post by Ville Voutilainen
In a stackless coroutine,
the erased type combined with elision of the erasure and allocations avoids
having all coroutines have a different type.
Yes, but it is easy to get an erased type from a concrete one when needed.
For instance, different lambdas have different types, but can easily be
placed into std::function (if they have an appropriate signature).
Ville Voutilainen
2015-10-14 21:18:18 UTC
Permalink
Post by Evgeny Panasyuk
Post by Ville Voutilainen
As far as having the concrete type goes, that sounds like it requires even
more inlining and across-call-stack transparency.
It actually requires less inlining and transparency. For instance, this
future<int> concrete_coroutine()
{
int local = await async_operation();
return local;
}
struct concrete_coroutine
{
state_value_type current_state;
int local;
future<int> method_state_machine(); // or operator()()
};
Where method_state_machine can be compiled separately, in another
translation unit.
And actually this approach is already implementable with macros, to some
extent (it works, but compiler-side transformation will give better result).
Where does this transformation happen translation-unit-wise, and how
would the method_state_machine get compiled in a different translation
unit? What type does the caller of the previous concrete_coroutine() see?
Post by Evgeny Panasyuk
Post by Ville Voutilainen
In a stackless coroutine,
the erased type combined with elision of the erasure and allocations avoids
having all coroutines have a different type.
Yes, but it is easy to get erased type from concrete when needed.
For instance different lambdas have different types, but can be easily
placed into std::function (if has appropriate signature).
For some values of "easily". For the many users who don't care about the
underlying type of the coroutine, it's not so easy when they have to wrap every
time they use a coroutine.
Evgeny Panasyuk
2015-10-14 21:46:34 UTC
Permalink
Post by Evgeny Panasyuk
Post by Evgeny Panasyuk
Post by Ville Voutilainen
As far as having the concrete type goes, that sounds like it requires even
more inlining and across-call-stack transparency.
It actually requires less inlining and transparency. For instance, this
future<int> concrete_coroutine()
{
int local = await async_operation();
return local;
}
struct concrete_coroutine
{
state_value_type current_state;
int local;
future<int> method_state_machine(); // or operator()()
};
Where method_state_machine can be compiled separately, in another
translation unit.
And actually this approach is already implementable with macros, to some
extent (it works, but compiler-side transformation will give better
result).
Where does this transformation happen translation-unit-wise, and how
would the method_state_machine get compiled in a different translation
unit?
This is a good point. It looks like such a transformation would have to happen
in each translation unit that uses the coroutine, in order to deduce the size
of the structure (maybe not full code generation, just analysis of the locals).
The method itself can be compiled in only one of the translation units, using
a mechanism similar to extern and explicit instantiation.
But my point still holds: this method need not be inlined (in the optimizer
sense) and can still produce zero allocations.
Post by Evgeny Panasyuk
What type does the caller of the previous concrete_coroutine() see?
What do you mean? Which "previous"?
Post by Evgeny Panasyuk
Post by Evgeny Panasyuk
Post by Ville Voutilainen
In a stackless coroutine,
the erased type combined with elision of the erasure and allocations avoids
having all coroutines have a different type.
Yes, but it is easy to get erased type from concrete when needed.
For instance different lambdas have different types, but can be easily
placed into std::function (if has appropriate signature).
For some values of "easily". For the many users who don't care about the
underlying type of the coroutine, it's not so easy when they have to wrap every
time they use a coroutine.
It can be done even without explicit wrapping, just by relying on
different coroutine_traits specializations. One trait may give a concrete
coroutine type, and another can erase the concrete type and give an erased
type to the user.
For instance:
concrete_generator<int> cg1()
{
yield 1;
}

concrete_generator<int> cg2()
{
yield 2;
}
Here cg1 and cg2 are different types.

But here:
type_erased_generator<int> teg1()
{
yield 1;
}

type_erased_generator<int> teg2()
{
yield 2;
}
teg1 and teg2 would have same type.

The user does not have to wrap cg1 and cg2 manually (though that is also
possible); instead, type_erased_generator can be used from the start, and it
will do the type erasure itself via the coroutine_traits mechanism.
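Since none of this syntax exists in a compiler, the following is only a hand-written approximation of the two flavours in today's C++. The structs yields_one and yields_two play the role of the concrete generators, and type_erased_generator here is an ordinary wrapper class written for this sketch (not a coroutine_traits specialization); all names are illustrative:

```cpp
#include <cassert>
#include <memory>
#include <type_traits>
#include <utility>

// Two hand-written "concrete" generators: distinct types, no heap needed.
struct yields_one {
    bool started = false;
    bool resume() { if (started) return false; started = true; return true; }
    int current_value() const { return 1; }
};
struct yields_two {
    bool started = false;
    bool resume() { if (started) return false; started = true; return true; }
    int current_value() const { return 2; }
};
static_assert(!std::is_same<yields_one, yields_two>::value,
              "concrete generators have different types");

// One type for any int generator, at the cost of a heap allocation and
// virtual dispatch per wrapped generator.
class type_erased_generator {
    struct concept_t {
        virtual ~concept_t() = default;
        virtual bool resume() = 0;
        virtual int current_value() const = 0;
    };
    template <class G>
    struct model_t final : concept_t {
        G gen;
        explicit model_t(G g) : gen(std::move(g)) {}
        bool resume() override { return gen.resume(); }
        int current_value() const override { return gen.current_value(); }
    };
    std::unique_ptr<concept_t> impl;
public:
    template <class G>
    type_erased_generator(G g) : impl(new model_t<G>(std::move(g))) {}
    bool resume() { return impl->resume(); }
    int current_value() const { return impl->current_value(); }
};

// Accepts any generator through the single erased type.
inline int sum_all(type_erased_generator g) {
    int total = 0;
    while (g.resume()) total += g.current_value();
    return total;
}
```

Both sum_all(yields_one{}) and sum_all(yields_two{}) compile against the same parameter type, while code that wants the concrete type can keep using yields_one and yields_two directly.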
Ville Voutilainen
2015-10-14 21:53:39 UTC
Permalink
Post by Evgeny Panasyuk
Post by Ville Voutilainen
What type does the caller of the previous concrete_coroutine() see?
What do you mean? Which "previous"?
You described how

future<int> concrete_coroutine()

is supposedly transformed. I don't know what that transformation does
from the point of view of the caller of concrete_coroutine.
Evgeny Panasyuk
2015-10-14 22:32:26 UTC
Permalink
Post by Ville Voutilainen
Post by Evgeny Panasyuk
Post by Ville Voutilainen
What type does the caller of the previous concrete_coroutine() see?
What do you mean? Which "previous"?
You described how
future<int> concrete_coroutine()
is supposedly transformed. I don't know what that transformation does
from the point of view of the caller of concrete_coroutine.
At a low level it gives a coroutine type with several methods, like resume
and is_terminated. It can be used like:
void test()
{
concrete_coroutine coro{};
future<int> f = coro.resume(); // or coro()
}

_______________
Another example:
concrete_low_level_generator<int> positive_numbers(int N)
{
for(int x=1; x<=N; ++x)
yield x;
}

void test()
{
positive_numbers xs{100};
while(xs.resume())
print(xs.current_value());
}
This example is very similar to the one described on page 14 of p0057r0.
Coroutine traits may provide higher-level abstractions on top of this, such as
a type that behaves like a range:
concrete_high_level_generator<int> positive_numbers(unsigned N)
{
for(int x=1; x<=N; ++x)
yield x;
}

void test()
{
for(auto x : positive_numbers{100})
print(x);
}
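As a rough sketch of how a range-like layer could sit on top of the low-level resume/current_value interface, here is a hand-written version that compiles today. positive_numbers_state stands in for the compiler-synthesized coroutine, and generator_range is a hypothetical adapter; neither name comes from P0057:

```cpp
#include <cassert>

// Hand-written stand-in for the synthesized state of positive_numbers(N).
struct positive_numbers_state {
    int limit;
    int x;
    explicit positive_numbers_state(int n) : limit(n), x(0) {}
    bool resume() { if (x >= limit) return false; ++x; return true; }
    int current_value() const { return x; }
};

// Adapter that exposes any resume()/current_value() generator as a range.
// The generator is stored by value, so no allocation is required.
template <class G>
class generator_range {
    G gen;
public:
    explicit generator_range(G g) : gen(g) {}

    class iterator {
        G* gen;
        bool alive;
    public:
        iterator() : gen(nullptr), alive(false) {}           // end marker
        explicit iterator(G* g) : gen(g), alive(g->resume()) {}
        int operator*() const { return gen->current_value(); }
        iterator& operator++() { alive = gen->resume(); return *this; }
        bool operator!=(const iterator& other) const { return alive != other.alive; }
    };

    iterator begin() { return iterator(&gen); }
    iterator end() { return iterator(); }
};

// Range-for drives the generator through the adapter.
inline int sum_positive_numbers(int n) {
    int total = 0;
    for (int x : generator_range<positive_numbers_state>(positive_numbers_state(n)))
        total += x;
    return total;
}
```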
Oliver Kowalke
2015-10-14 08:39:58 UTC
Permalink
Post by Evgeny Panasyuk
And I think C++ ISO needs both - stackless and stackful coroutines.
(though I would prefer to get stackless into ISO first, because stackful
can be completely implemented in library, like Boost.Context/Coroutine)
But implementing context switching is cumbersome (it requires assembler) -
it would be better if the compiler vendors provided implementations for all
those combinations of architecture + ABI + binary format.
Nicol Bolas
2015-10-14 04:46:50 UTC
Permalink
Post by Evgeny Panasyuk
Post by Nicol Bolas
Except that he's already proven (in this thread no less) that a
good optimizer can elide the allocation. If the compiler can
reasonably /make/ it zero overhead, then it /is/ zero overhead.
1. It is impossible (practically) in the general case.
vector<coroutine> x(N);
In the case of coroutines with concrete types and sizeof known at
compile time, this can be done within a single allocation.
But if the coroutine type is erased, then we will have N+1 allocations in
the general case - it can't be practically elided.
Ignoring the rest of the discussion on this point, I never claimed that
P0057 could guarantee elision in the case you present here. Before, you
asked about a /specific/ problem, and I answered with a specific example
showing that it was elidable. What you've shown here hardly disproves my
point.
It is not zero overhead even with good optimizer/compiler, because they
can't elide every allocation, and I am not talking about some exotic cases.
Thus far, including in this post, you haven't mentioned an example that
would actually compile.

You linked to some macro code, but macros are, basically, *cheating*. They
get to break all kinds of C++ rules, which an actual language feature would
not.
Post by Evgeny Panasyuk
Post by Nicol Bolas
Also... how does `vector<coroutine>` make any kind of sense with regard
to P0114? The type isn't type erased, so each coroutine has its own
type. Therefore, in order to put them in a homogeneous container like
`vector`, you'll have to type-erase them. Which requires memory
allocation.
Post by Nicol Bolas
At which point, your version gains /nothing/ over P0057.
Same coroutines have same concrete types. For instance, with P0114 it
struct concrete_coroutine
{
resumable auto r = expression;
// ...
};
...
make_unique<concrete_coroutine[]>(N);
`auto` doesn't work that way. Non-static data members cannot be `auto`.
Normally I wouldn't care about a small issue like that, but it basically
makes your code impossible.

Without `auto` NSDMI (and I wouldn't hold my breath on seeing it
<http://www.open-std.org/JTC1/SC22/WG21/docs/papers/2014/n3897.html>), you
can't store a resumable expression. So you can't make containers of them.

Unless you erase their types. So again, you've gained *nothing*.

Your macro solution gets around this because it uses macros.
Post by Evgeny Panasyuk
Again, I am not talking specifically about P0114. Even P0057 can be
changed to have concrete coroutine type.
I'd be curious to see how, exactly.

And I don't mean some macro nonsense. I mean the specific details of how
you turn P0057's `coroutine_handle` into a type.

See, the way P0057 works is that the coroutine object is introduced in one
specific place: the awaiter object's `await_suspend` method. That's the
first place where user code gets to touch a coroutine, and if the method
doesn't store it or otherwise keep it around, it's also the last.

That's what I meant when I said "adding a template". Because now,
`await_suspend` *must* become a template function. There's no other way to
capture a parameter of an arbitrary, compiler-generated type.

But what of generators and promises in such a scenario? A `generator<int>`
needs to be able to store any coroutine, so it has to... type erase it. The
promise type cannot be a template on the coroutine type, since it was
declared long before the coroutine handle appeared. And so forth.

In short, the entirety of P0057 is designed around a type-erased
`coroutine_handle`. You can't simply declare that it's not type-erased and
expect everything to work reasonably. The entire design would need to be
rethought.

If you want this done, then you're going to need to go through the effort
of designing the feature to work without type erasure. Then you have to get
someone to implement it. Then, you can know whether it works just as well
as P0057, whether it's equally easy to use, and how much of a performance
advantage it gets.

If any.
Evgeny Panasyuk
2015-10-14 18:29:22 UTC
Permalink
Post by Nicol Bolas
Thus far, including in this post, you haven't mentioned an example that
would actually compile.
Actually, the examples do compile and do run.
Post by Nicol Bolas
You linked to some macro code, but macros are, basically, *cheating*.
They get to break all kinds of C++ rules, which an actual language feature
would not.
Macros allow us to emulate a language feature and test it now, with current
compilers. Even Stroustrup uses macros in the Mach7 library to emulate a
language feature.
I think it is obvious that the following macro-based code:
COROUTINE(vector<int>, list_demo, (int, param),
(int, local_x)
(int, local_y))
{
AWAIT(local_x =) vector<int>{1,2,3};
AWAIT(local_y =) vector<int>{10, 20, 30};

RETURN(local_x + local_y + param);
}
COROUTINE_END;

is equivalent to the following code with language support:

vector<int> list_demo(int param)
{
int local_x = await vector<int>{1,2,3};
int local_y = await vector<int>{10, 20, 30};

return local_x + local_y + param;
}

And if the macro-based version works, then this one will work without
problems.
Post by Nicol Bolas
Post by Nicol Bolas
Post by Nicol Bolas
Also... how does `vector<coroutine>` make any kind of sense with regard
to P0114? The type isn't type erased, so each coroutine has its own
type. Therefore, in order to put them in a homogeneous container like
`vector`, you'll have to type-erase them. Which requires memory
allocation.
Post by Nicol Bolas
At which point, your version gains /nothing/ over P0057.
Same coroutines have same concrete types. For instance, with P0114 it
struct concrete_coroutine
{
resumable auto r = expression;
// ...
};
...
make_unique<concrete_coroutine[]>(N);
`auto` doesn't work that way. Non-static data members cannot be `auto`.
Normally I wouldn't care about a small issue like that, but it basically
makes your code impossible.
Such usage of auto appears in p0114r0.pdf on page 11.
Post by Nicol Bolas
Without `auto` NSDMI (and I wouldn't hold my breath on seeing it
<http://www.open-std.org/JTC1/SC22/WG21/docs/papers/2014/n3897.html>),
you can't store a resumable expression. So you can't make containers of
them.
Unless you erase their types. So again, you've gained *nothing*.
Your macro solution gets around this because it uses macros.
I showed two versions above - macro based version and possible syntax with
language support.
Do you have any concrete reasoning why it would be impossible without
macros?
Post by Nicol Bolas
Again, I am not talking specifically about P0114. Even P0057 can be
Post by Nicol Bolas
changed to have concrete coroutine type.
I'd be curious to see how, exactly.
Currently it works like this:
struct generator
{
...
coroutine_handle<promise_type> coro;
};

generator example_generator()
{
yield 1;
}

int main()
{
generator x = example_generator();
x.move_next();
x.current_value();
}

With concrete coroutine type it could be something like this:

template<template<typename> class coroutine_value>
struct generator
{
...
coroutine_value<promise_type> coro;
};

generator example_generator()
{
yield 1;
}
// example_generator is transformed to:
using example_generator = generator< synthesized_coroutine >;

int main()
{
example_generator x{};
x.move_next();
x.current_value();
}
Post by Nicol Bolas
If you want this done, then you're going to need to go through the effort
of designing the feature to work without type erasure. Then you have to get
someone to implement it.
It could be implemented even with macros, to some extent. And I think a
macro-based solution is enough for a proof of concept.
Post by Nicol Bolas
Then, you can know whether it works just as well as P0057, whether it's
equally easy to use, and how much of a performance advantage it gets.
If any.
Of course it gives a performance advantage, because it does not impose an
extra mandatory allocation.
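The allocation difference can be illustrated without any coroutine support at all. In the sketch below, counted_frame is a hypothetical stand-in for a coroutine frame, and a class-level operator new counts only frames that are individually heap-allocated. Owning pointers are used as a rough model of erased handles whose allocation the optimizer could not elide; this is an assumption of the sketch, not a measurement of P0057:

```cpp
#include <cassert>
#include <cstddef>
#include <memory>
#include <new>
#include <vector>

// Stand-in for a coroutine frame. The class-level operator new is used by
// new-expressions, but not by std::allocator, which calls ::operator new.
struct counted_frame {
    int state = 0;
    static std::size_t& heap_frames() { static std::size_t n = 0; return n; }
    static void* operator new(std::size_t size) {
        ++heap_frames();
        return ::operator new(size);
    }
    static void operator delete(void* p) { ::operator delete(p); }
};

// Concrete frames stored by value: the vector makes one buffer allocation
// and no frame is allocated on its own.
inline std::size_t frames_allocated_by_value(std::size_t n) {
    counted_frame::heap_frames() = 0;
    std::vector<counted_frame> frames(n);
    return counted_frame::heap_frames();
}

// Frames held through owning pointers: one buffer allocation for the vector
// plus one allocation per frame.
inline std::size_t frames_allocated_by_pointer(std::size_t n) {
    counted_frame::heap_frames() = 0;
    std::vector<std::unique_ptr<counted_frame>> frames;
    frames.reserve(n);
    for (std::size_t i = 0; i < n; ++i)
        frames.emplace_back(new counted_frame());
    return counted_frame::heap_frames();
}
```

With N elements the by-value container reports zero per-frame allocations and the pointer-based one reports N, which is the "1 versus N+1" difference once the vector's own buffer is included.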
Nicol Bolas
2015-10-14 19:35:49 UTC
Permalink
Post by Evgeny Panasyuk
Post by Nicol Bolas
Thus far, including in this post, you haven't mentioned an example that
would actually compile.
Actually examples do compile and do run.
Post by Nicol Bolas
You linked to some macro code, but macros are, basically, *cheating*.
They get to break all kinds of C++ rules, which an actual language feature
would not.
Macros allow us to emulate language feature, to test it now, with current
compilers. Even Stroustrup uses macros in Mach7 library to emulate language
feature.
COROUTINE(vector<int>, list_demo, (int, param),
(int, local_x)
(int, local_y))
{
AWAIT(local_x =) vector<int>{1,2,3};
AWAIT(local_y =) vector<int>{10, 20, 30};
RETURN(local_x + local_y + param);
}
COROUTINE_END;
vector<int> list_demo(int param)
{
int local_x = await vector<int>{1,2,3};
int local_y = await vector<int>{10, 20, 30};
return local_x + local_y + param;
}
And if macro-based version does work, then this one will work without
problems.
I'll talk about this more later, but a good language feature should be
*minimal*, not do whatever it takes. That's why a macro approach is a bad
idea for a proposal. It's fine for a general sketch. But macros make you
brave; you can do anything with them.

When it comes to a language feature, you shouldn't do *anything*. You
should do just enough, and no more.

Same coroutines have same concrete types. For instance, with P0114 it
Post by Evgeny Panasyuk
Post by Nicol Bolas
Post by Evgeny Panasyuk
struct concrete_coroutine
{
resumable auto r = expression;
// ...
};
...
make_unique<concrete_coroutine[]>(N);
`auto` doesn't work that way. Non-static data members cannot be `auto`.
Normally I wouldn't care about a small issue like that, but it basically
makes your code impossible.
Such usage of auto is at p0114r0.pdf at page 11.
True, but that doesn't make it *correct*. C++14 doesn't let `auto` do that;
the standard is very clear on that. And P0114 does not actually propose
allowing `auto` to do that.

All you've shown is that P0114 is in error.

Without `auto` NSDMI (and I wouldn't hold my breath on seeing it
Post by Evgeny Panasyuk
Post by Nicol Bolas
<http://www.open-std.org/JTC1/SC22/WG21/docs/papers/2014/n3897.html>),
you can't store a resumable expression. So you can't make containers of
them.
Unless you erase their types. So again, you've gained *nothing*.
Your macro solution gets around this because it uses macros.
I showed two versions above - macro based version and possible syntax with
language support.
Do you have any concrete reasoning why it would be impossible without
macros?
Because you'd have to get language support for `auto` in NSDMI's. And I
just linked to you a discussion about precisely that and how it's *not
gonna happen*. So your "possible syntax with language support" doesn't hold
water.

Again, I am not talking specifically about P0114. Even P0057 can be
Post by Evgeny Panasyuk
Post by Nicol Bolas
Post by Evgeny Panasyuk
changed to have concrete coroutine type.
I'd be curious to see how, exactly.
struct generator
{
...
coroutine_handle<promise_type> coro;
};
generator example_generator()
{
yield 1;
}
int main()
{
generator x = example_generator();
x.move_next();
x.current_value();
}
template<template<typename> class coroutine_value>
struct generator
{
...
coroutine_value<promise_type> coro;
};
generator example_generator()
{
yield 1;
}
using example_generator = generator< synthesized_coroutine >;
int main()
{
example_generator x{};
x.move_next();
x.current_value();
}
Um, what does that code mean? Where does `synthesized_coroutine` come from?
How does `example_generator` get defined twice? And how does
`example_generator` return a template that has no template arguments?

A nice thing about resumable functions is that it doesn't take a
sledgehammer to basic elements of the language. If a coroutine function
returns a type, it *returns that type*, and the return value has all the
rights and behaviors of a return value from a regular function.

With P0057, C++ works as normal, except where absolutely *necessary*.

What you're suggesting requires a bunch of different changes to lots of
elements of C++. You have to be able to return a template with no
arguments, whose arguments are provided by that `using` declaration, I
guess. And that the argument has to be able to be generated from...
whatever `synthesized_coroutine` is. And so on.

That's a huge amount of work to do just to avoid type erasure. And not just
library work; that's *core language* work. Lots of it.

After all, there's no proposal even *remotely* like this at present. Even
your idea above is incomplete, as it's not clear what all of those pieces
actually mean or do (P0057 makes `await` mean one thing. What does it mean
in your idea?). You have one general notion: coroutines having a firm type.
And you're ready and willing to invent a plethora of subsidiary C++ language
features that exist for the sole purpose of making that work.

That's not a good way to make a solid proposal. If that one thing requires
so many subsidiary language features... maybe that one thing is not worth
it.

Even if we accept that this is a good way to make a proposal... it's not a
*proposal* yet. It's just some ideas being batted around on a forum. None
of the various coroutine proposals do anything like what you've suggested.
Why should we halt or delay progress on P0057 because you *think* you might
be able to do better?

I hate to use this phrase as a way to win arguments, but "perfect is the
enemy of good".
Evgeny Panasyuk
2015-10-14 20:14:30 UTC
Permalink
Post by Evgeny Panasyuk
And if macro-based version does work, then this one will work without
Post by Evgeny Panasyuk
problems.
I'll talk about this more later, but a good language feature should be
*minimal*, not do whatever it takes. That's why a macro approach is a bad
idea for a proposal. It's fine for a general sketch. But macros make you
brave; you can do anything with them.
When it comes to a language feature, you shouldn't do *anything*. You
should do just enough, and no more.
I don't do "anything" here. And I don't see that it requires "anything".
Syntax is very similar to what P0057 proposes.
Post by Evgeny Panasyuk
Without `auto` NSDMI (and I wouldn't hold my breath on seeing it
Post by Evgeny Panasyuk
Post by Nicol Bolas
<http://www.open-std.org/JTC1/SC22/WG21/docs/papers/2014/n3897.html>),
you can't store a resumable expression. So you can't make containers of
them.
Unless you erase their types. So again, you've gained *nothing*.
Your macro solution gets around this because it uses macros.
I showed two versions above - macro based version and possible syntax
with language support.
Do you have any concrete reasoning why it would be impossible without
macros?
Because you'd have to get language support for `auto` in NSDMI's. And I
just linked to you a discussion about precisely that and how it's *not
gonna happen*. So your "possible syntax with language support" doesn't
hold water.
Again, here I am not talking about P0114. The example code above is much
closer to P0057 than to P0114. And it does not require "auto in NSDMI" -
that is clearly seen from the code.
Post by Evgeny Panasyuk
Post by Evgeny Panasyuk
template<template<typename> class coroutine_value>
struct generator
{
...
coroutine_value<promise_type> coro;
};
generator example_generator()
{
yield 1;
}
using example_generator = generator< synthesized_coroutine >;
int main()
{
example_generator x{};
x.move_next();
x.current_value();
}
Um, what does that code mean? Where does `synthesized_coroutine` come from?
"using" part is done by compiler, synthesized_coroutine comes from compiler.
Post by Evgeny Panasyuk
How does `example_generator` get defined twice?
It is not defined twice. The first one is what the user writes; the second one
(the "using" part) is what the compiler does for this code. In essence, the
user code is transformed into a type with the name example_generator.
Post by Evgeny Panasyuk
And how does `example_generator` return a template that has no template
arguments?
If you don't like it, it is possible to return a type with a template inside.
For instance
struct generator
{
template<template<typename> class coroutine_value>
struct apply { ... };
};
A nice thing about resumable functions is that it doesn't take a
Post by Evgeny Panasyuk
sledgehammer to basic elements of the language. If a coroutine function
returns a type, it *returns that type*, and the return value has all the
rights and behaviors of a return value from a regular function.
It is not truly a return type. Even P0057 does not have a true return type:
you can't return a value of that type from the body. It just mimics normal
function syntax, but it is not a normal function at all; it is just a
synthetic language construct.
Post by Evgeny Panasyuk
What you're suggesting requires a bunch of different changes to lots of
elements of C++. You have to be able to return a template with no
arguments, whose arguments are provided by that `using` declaration, I
guess.
No, it does not require changes to lots of C++ elements. In both cases this
is not a true function; it is just something with function-like syntax that
defines a coroutine.


After all, there's no proposal even *remotely* like this at present. Even
Post by Evgeny Panasyuk
your idea above is incomplete, as it's not clear what all of those pieces
actually mean or do (P0057 makes `await` mean one thing. What does it mean
in your idea?). You have one general notion: coroutines having a firm type.
And you're ready and willing to invent a plethora of subsidiary C++ language
features that exist for the sole purpose of making that work.
I am not proposing to invent a plethora of subsidiary features.
Post by Evgeny Panasyuk
That's not a good way to make a solid proposal. If that one thing requires
so many subsidiary language features... maybe that one thing is not worth
it.
No, this does not require subsidiary language features.
Post by Evgeny Panasyuk
Even if we accept that this is a good way to make a proposal... it's not a
*proposal* yet. It's just some ideas being batted around on a forum. None
of the various coroutine proposals do anything like what you've suggested.
Why should we halt or delay progress on P0057 because you *think* you
might be able to do better?
If the authors of P0057 still insist on a design with intrinsic overhead
and a high burden on optimizers, then you are right: probably the viable
path is to make another proposal.
Personally I would prefer to get fast coroutines in, say, 2020 than to get
some coroutines in 2017.
Nicol Bolas
2015-10-16 16:26:18 UTC
Permalink
Post by Evgeny Panasyuk
Post by Nicol Bolas
Post by Evgeny Panasyuk
template<template<typename> class coroutine_value>
struct generator
{
...
coroutine_value<promise_type> coro;
};
generator example_generator()
{
yield 1;
}
using example_generator = generator< synthesized_coroutine >;
int main()
{
example_generator x{};
x.move_next();
x.current_value();
}
Um, what does that code mean? Where does `synthesized_coroutine` come from?
"using" part is done by compiler, synthesized_coroutine comes from compiler.
Post by Nicol Bolas
How does `example_generator` get defined twice?
It is not defined twice. The first one is what the user writes; the second
one (the "using" part) is what the compiler does for this code. In essence,
the user code is transformed into a type named example_generator.
And how does `example_generator` return a template that has no template
Post by Evgeny Panasyuk
Post by Nicol Bolas
arguments?
If you don't like it,
It's not a question of what I like. It is simply *not possible in C++*. You
cannot return a template; you can only return a concrete type. This may be
a specific instantiation of a template, but you cannot return a template
itself.

What you wrote is syntactic nonsense. Therefore, if you want it to stop
being syntactic nonsense, your proposal will need to define what it means.

That's fine, but it is another feature added to your idea. Hence the whole
"plethora of subsidiary features" I was talking about. Every time someone
points out a problem with making coroutines concrete types, you resolve it
by adding another feature to the language. I remind you that it's
impossible to return a template, so you then define how being a coroutine
makes the previously impossible possible.

That's a new feature.

The nice thing about P0057 is that it doesn't have very many new features.
It pretty much stops at function suspend/resume and internally-generated
promise types. The return type of a coroutine is no different from any
other type. The awaiter type, even the promise type are all types using the
C++ rules for types.

Your proposal seems to require a lot of special-case handling at the type
level.

it is possible to return type with template inside. For instance
Post by Evgeny Panasyuk
struct generator
{
template<template<typename> class coroutine_value>
struct apply { ... };
};
OK, so... what does that do? Does `generator` store the coroutine? That
seems more or less impossible, since you have the same problem: a template
parameter for the return type getting filled in by the function returning
it.

Somewhere, there's a variable whose type, an instantiation of a template,
has one of its template arguments get filled in by the compiler. That's a
very new thing that's unlike normal C++ code.

A nice thing about resumable functions is that it doesn't take a
Post by Evgeny Panasyuk
Post by Nicol Bolas
sledgehammer to basic elements of the language. If a coroutine function
returns a type, it *returns that type*, and the return value has all the
rights and behaviors of a return value from a regular function.
It is not truly a return type. Even P0057 does not have a true return type:
you can't return a value of that type from the body. It just mimics normal
function syntax, but it is not a normal function at all; it is just a
synthetic language construct.
OK, internally it may only "mimic normal function syntax", but by design,
the "mimicry" is *complete*. It looks and behaves no different from any
other function. There is absolutely no way to tell the difference between
it and non-coroutine functions.

Your proposal exposes *everyone* to the deep guts of working with a
coroutine. All just for some minor performance gain that in most situations
compilers can optimize out. And even when they can't, *you* can optimize
them out with a decent allocator.

Even if we accept that this is a good way to make a proposal... it's not a
Post by Evgeny Panasyuk
Post by Nicol Bolas
*proposal* yet. It's just some ideas being batted around on a forum.
None of the various coroutine proposals do anything like what you've
suggested. Why should we halt or delay progress on P0057 because you
*think* you might be able to do better?
If the authors of P0057 still insist on a design with intrinsic overhead
and a high burden on optimizers, then you are right: probably the viable
path is to make another proposal.
Passive-aggressiveness does not prove your point. For example, you have yet
to prove that the "burden on optimizers" is "high" by some definition of
that word. It's merely "non-zero".

Just like the burden on optimizers for dealing with template code,
inlining, and the like.

Personally I would prefer to get fast coroutines in, say, 2020 than to get
Post by Evgeny Panasyuk
some coroutines in 2017.
Considering that you haven't proven that P0057 is particularly slow, I
remain yet unconvinced that the performance gains you claim are necessary
to be "fast" are actually worth those 3 years.
Germán Diago
2015-10-17 08:52:10 UTC
Permalink
Post by Nicol Bolas
Passive-aggressiveness does not prove your point. For example, you have
yet to prove that the "burden on optimizers" is "high" by some definition
of that word. It's merely "non-zero".

I think there is no need to, well, insult anyone for having a different
view by insinuating he is being too aggressive. I think it is good to have
a discussion.

That said, one of the principles of C++ is the zero-overhead principle. Not
the "little overhead" principle. Why? Because C++ is for maximum performance,
and if you do something suboptimal by design, people are going to invent
another solution.

Herb Sutter defined this zero overhead as there being nothing between C++
and assembly. I agree with that.
The only library whose design I don't like, and which you mentioned
before, is iostreams. No library has followed the iostreams path since then.
We have templated, non-inheritance-based components that are mostly generic.

Erasure does have costs and there are alternatives. Chris's paper mentions
the inherent erasure overhead. Why would it be mentioned if it were not that
important? Erasure cannot be controlled in all scenarios once it is
embedded into the design. That is simply true. So the
question here should be whether we can have a design with inherently minimal
overhead, not only whether you sympathize with one solution or another.
Evgeny Panasyuk
2015-10-17 11:35:35 UTC
Permalink
Post by Nicol Bolas
It's not a question of what I like. It is simply *not possible in C++*.
You cannot return a template; you can only return a concrete type. This may
be a specific instantiation of a template, but you cannot return a template
itself.
You can return a concrete type which has a template inside.
Post by Nicol Bolas
What you wrote is syntactic nonsense. Therefore, if you want it to stop
being syntactic nonsense, your proposal will need to define what it means.
Again, it is not a finished proposal.
For now I am pointing out flaws in the existing proposal, and I want to
discuss them.
Post by Nicol Bolas
Your proposal seems to require a lot of special-case handling at the type
level.
No, it does not require much special-case handling.
Post by Nicol Bolas
it is possible to return type with template inside. For instance
Post by Evgeny Panasyuk
struct generator
{
template<template<typename> class coroutine_value>
struct apply { ... };
};
OK, so... what does that do? Does `generator` store the coroutine?
It describes how to create result type.
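[Editorial sketch, not part of the original message.] One way to read "a concrete type which has a template inside" is shown below in today's C++. The state machine that the thread says the compiler would synthesize is written out by hand here, so `synthesized_coroutine`, `promise_type`, `resume` and `sum_all` are all assumptions made for illustration; no proposal defines them this way.

```cpp
#include <cassert>

// Hypothetical stand-in for the state machine a compiler would synthesize
// from a body like `yield 1; yield 2;`. Written by hand for illustration.
template <typename Promise>
struct synthesized_coroutine {
    int state = 0;
    // Advances to the next yield; returns false once the body has finished.
    bool resume(Promise& p) {
        switch (state) {
        case 0: p.value = 1; state = 1; return true;
        case 1: p.value = 2; state = 2; return true;
        default: return false;
        }
    }
};

// A concrete, non-template type that carries a template inside.
struct generator {
    struct promise_type { int value = 0; };

    template <template <typename> class coroutine_value>
    struct apply {
        promise_type promise;
        coroutine_value<promise_type> coro; // stored by value, no allocation
        bool move_next() { return coro.resume(promise); }
        int current_value() const { return promise.value; }
    };
};

// The step the thread attributes to the compiler: plug the synthesized
// coroutine type into the nested template.
using example_generator = generator::apply<synthesized_coroutine>;

// Sums everything the generator yields.
inline int sum_all() {
    example_generator g{};
    int total = 0;
    while (g.move_next()) total += g.current_value();
    return total;
}
```

With these definitions, `sum_all()` walks both yields and adds them up. The sketch only shows that the type of the result can be spelled without erasure; it says nothing about how a compiler would produce `synthesized_coroutine`.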
Post by Nicol Bolas
OK, internally it may only "mimic normal function syntax", but by design,
the "mimicry" is *complete*. It looks and behaves no different from any
other function. There is absolutely no way to tell the difference between
it and non-coroutine functions.
This is possible (if it is required to mimic function syntax) for concrete
coroutines. We already have "auto" return types on functions - this
mechanism can be used for concrete coroutines. Check my previous message.
Post by Nicol Bolas
Your proposal exposes *everyone* to the deep guts of working with a
coroutine. All just for some minor performance gain that in most situations
compilers can optimize out.
It is not a minor difference. One allocation and virtual/indirect calls for
resumption is a huge overhead for things like generators.
Post by Nicol Bolas
And even when they can't, *you* can optimize them out with a decent
allocator.
First of all, I don't want to use custom allocators for simple things like
generators.
Second, even custom allocator is not zero-overhead - at least it must check
size, because it is not known at compile time.
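[Editorial sketch, not part of the original message.] The two shapes being contrasted can be written by hand as follows. `iota_gen`, `erased_gen` and the helper functions are hypothetical names; the erased version uses `std::function` and `std::make_shared` only as a familiar stand-in for "one allocation plus indirect calls", not as a model of what P0057 generates. The sketch measures nothing.

```cpp
#include <cassert>
#include <functional>
#include <memory>

// Concrete generator: the state lives inside the object and its type is
// visible to the caller, so next() is a direct call that can be inlined.
struct iota_gen {
    int current;
    int last;
    bool next(int& out) {
        if (current > last) return false;
        out = current++;
        return true;
    }
};

// Type-erased generator: the caller only sees std::function, so every step
// goes through an indirect call.
using erased_gen = std::function<bool(int&)>;

inline erased_gen make_erased_iota(int first, int last) {
    // The state is heap-allocated here (one allocation per generator).
    auto state = std::make_shared<iota_gen>(iota_gen{first, last});
    return [state](int& out) { return state->next(out); };
}

inline int sum_concrete(int first, int last) {
    iota_gen g{first, last};
    int total = 0, v = 0;
    while (g.next(v)) total += v;
    return total;
}

inline int sum_erased(int first, int last) {
    erased_gen g = make_erased_iota(first, last);
    int total = 0, v = 0;
    while (g(v)) total += v;
    return total;
}
```

Both functions compute the same sum; the difference is only in where the state lives and how each step is dispatched, which is the cost under discussion.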
Post by Nicol Bolas
Even if we accept that this is a good way to make a proposal... it's not a
Post by Evgeny Panasyuk
Post by Nicol Bolas
*proposal* yet. It's just some ideas being batted around on a forum.
None of the various coroutine proposals do anything like what you've
suggested. Why should we halt or delay progress on P0057 because you
*think* you might be able to do better?
If the authors of P0057 still insist on a design with intrinsic overhead
and a high burden on optimizers, then you are right: probably the viable
path is to make another proposal.
Passive-aggressiveness does not prove your point.
Constantly calling things you do not like "nonsense" does not prove your
point either.
Post by Nicol Bolas
For example, you have yet to prove that the "burden on optimizers" is
"high" by some definition of that word. It's merely "non-zero".
Just like the burden on optimizers for dealing with template code,
inlining, and the like.
Well, I agree. I think I should write a detailed report on this issue,
showing the concrete aspects of the overhead, what compilers can do today,
etc.
Post by Nicol Bolas
Personally I would prefer to get fast coroutines in, say, 2020 than to get
Post by Evgeny Panasyuk
some coroutines in 2017.
Considering that you haven't proven that P0057 is particularly slow, I
remain yet unconvinced that the performance gains you claim are necessary
to be "fast" are actually worth those 3 years.
I have already run some tests, checked the ASM for both versions, etc. I
should do more tests and then write up some kind of report based on them.
German Diago
2015-10-14 07:14:47 UTC
Permalink
Post by Nicol Bolas
P0114 *requires* that all resumable functions you call are inlined. If
they're not inlined, you have to manually box them (and the boxing function
is no longer resumable). Boxing involves type erasure. And as previously
stated, memory allocation.
You can box it. You *can*; you do not *need* to. And you do not need an
inline resumable expression if you just implement and type-erase it in a
.cpp file. You still have the same amount of power as with your proposal;
you just have the *additional* option of inlining.
I think this alone already makes that proposal inherently "more
zero-overhead".
t***@gmail.com
2015-10-14 11:50:33 UTC
Permalink
Post by Gor Nishanov
I have had an outstanding challenge for a year already to anyone who
thinks that way to come up with a real world problem, reduce it to
manageable size (say async_tcp_reader), and write it up both ways using P0057
1) How much code the end user has to write
2) How much library support is required
3) What the abstraction penalty is: how many instructions need to get
executed to get from, say, await Read(buf, len) to a low-level
API/hardware, say WSARecv
My statement is that P0057 is as good or better on all 3 criteria than any
other proposal I've seen. If you want to accept the challenge, write up an
That is a good way forward. I think the abstraction penalty should be the
same, otherwise the "resumable" proposal is dead. Given that, I don't agree
that the amount of code is the most important aspect. What matters here is
that normal programmers should be able to write correct programs without
subtle bugs. I don't mind writing a little more code, if the resulting code
is easier to get correct.

kind regards

Thorsten
Gor Nishanov
2015-10-14 13:37:50 UTC
Permalink
Post by Gor Nishanov
3) What is an abstraction penalty, how many instructions need to get
Post by Gor Nishanov
executed to get from, say, await Read(buf, len) to a low-level
API/hardware, say WSARecv
My statement is that P0057 is as good or better on all 3 criteria than
any other proposal I've seen. If you want to accept the challenge, write up
That is a good way forward. I think the abstraction penalty should be the
same,
I really hoped that one of the proponents would do the exercise and reach
the same conclusion as I did: namely, that P0114 has significantly higher
overhead, measured in hundreds more instructions, memory barriers, type
erasure and memory allocations.

If you look at
http://open-std.org/JTC1/SC22/WG21/docs/papers/2015/p0055r0.html, it shows
how you can automate creation of awaitables for template based libraries
dealing with async I/O that gets you code equivalent to hand written
assembly.

ResultType r = await async_xyz(p);

becomes

async_xyz`Awaiter __tmp{p};
$promise.resume_addr = &__resume_label; // save the resumption point of the coroutine
__tmp.resume = $RBP; // inlined await_suspend
os_xyz(p,&OsContextBase::Invoke, &__tmp); // inlined await_suspend
jmp Epilogue; // suspends the coroutine
__resume_label: // will be resumed at this point once the operation is finished
R r = move(__tmp.result); // inlined await_resume


which is pretty much what you would have written by hand if you would write
your coroutine in assembly.
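[Editorial sketch, not part of the original message.] The expansion above can be imitated by hand in ordinary C++ to make the three steps (save the resume point, hand the operation to the OS, pick up the result on resumption) concrete. Everything here is an assumption made for illustration: `fake_os`, `read_awaiter` and `reader_coroutine` are hypothetical, and a `std::function` callback stands in for the raw resume address, so this does not reproduce the zero-overhead code the message describes.

```cpp
#include <cassert>
#include <functional>
#include <utility>

// Hypothetical stand-in for an OS-level async API: it stores the completion
// callback instead of starting real I/O, so the caller can complete it by hand.
struct fake_os {
    std::function<void(int)> pending;
    void start_read(std::function<void(int)> on_done) { pending = std::move(on_done); }
    void complete(int bytes) {
        auto cb = std::move(pending);
        cb(bytes);
    }
};

// Awaiter with suspend/resume hooks named after the ones in the message.
struct read_awaiter {
    fake_os& os;
    int result = 0;
    explicit read_awaiter(fake_os& o) : os(o) {}

    // Hands the operation to the "OS" and arranges for the coroutine to resume.
    template <typename Resume>
    void await_suspend(Resume resume) {
        os.start_read([this, resume](int bytes) { result = bytes; resume(); });
    }
    int await_resume() const { return result; }
};

// Hand-written equivalent of a body like `int n = await read(); total += n;`
struct reader_coroutine {
    read_awaiter tmp;
    int resume_point = 0;
    int total = 0;
    bool done = false;

    explicit reader_coroutine(fake_os& o) : tmp(o) {}

    void resume() {
        switch (resume_point) {
        case 0:
            resume_point = 1;                        // save the resumption point
            tmp.await_suspend([this] { resume(); }); // "inlined" await_suspend
            return;                                  // suspend the coroutine
        case 1:
            total += tmp.await_resume();             // "inlined" await_resume
            done = true;
            return;
        }
    }
};
```

Calling `resume()` once starts the read and returns to the caller; calling `complete(n)` on the `fake_os` later re-enters the coroutine at the saved point and adds `n` to `total`. The coroutine object must stay at a fixed address while an operation is pending, because the callback captures `this`.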
Compare that to "can do it in a library" approach.

First, look at heavy-weight await emulation machinery described in 12.5 of
P0114R0 starting on page 21.

But, let's say, await is not a natural pattern for P0114, and that is why it
is heavyweight. Let's look at what could be a more natural pattern, shown in:
https://github.com/chriskohlhoff/resumable-expressions/blob/master/examples/await4.cpp
The situation is the same: not as bad as with await emulation, but still bad.

Though, I think that this is fixable. If P0114RX gets syntactic sugar
equivalent to await and yield, it would not need to do as much work in the
library, so it would become as efficient as P0057.

What remains are various inconveniences of how coroutines are presented to
the user.
You need to do manual type erasure, like in this example (await4.cpp):

resumable void echo(tcp::socket socket);

resumable void listen(tcp::acceptor acceptor) {
...
spawn([s = std::move(socket)]() mutable { echo(std::move(s)); });

When I say manual type erasure, I mean that instead of just calling echo,
you need to wrap it in a lambda and hand it to a type-erasing library helper,
spawn, that does the wrapping and allocation for you.

Another limitation is that you have to put everything in one file for the
code above to work: for listen to be able to use echo in a resumable
expression, it needs to see the body of echo, so echo needs to be defined
in the same TU. If you want to put them in different files, you need to
create little type-erasing wrappers, as in:

echo.h:
void echo(tcp::socket socket);

echo.cpp:

resumable void echo_impl(tcp::socket socket) { ... }

void echo(tcp::socket socket) {
spawn([s = std::move(socket)]() mutable { echo_impl(std::move(s)); });
}
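
[Editorial sketch, not part of the original message.] The wrapper pattern above can be tried out in ordinary C++ if the `resumable` body is replaced by a plain function. `spawn`, `task_queue`, `run_all` and `echoed` below are hypothetical helpers written for this sketch, not the library from P0114 or from the linked examples, and a `std::string` stands in for `tcp::socket`.

```cpp
#include <cassert>
#include <functional>
#include <memory>
#include <string>
#include <utility>
#include <vector>

// Hypothetical scheduler queue: every task is stored under one erased type.
inline std::vector<std::function<void()>>& task_queue() {
    static std::vector<std::function<void()>> q;
    return q;
}

// Hypothetical spawn: boxes the callable on the heap and erases its type.
// shared_ptr is used because std::function requires a copyable target,
// while lambdas with move-only captures are not copyable.
template <typename F>
void spawn(F f) {
    auto boxed = std::make_shared<F>(std::move(f)); // the allocation
    task_queue().push_back([boxed] { (*boxed)(); }); // the erasure
}

// Records what ran, so the effect is observable.
inline std::vector<std::string>& echoed() {
    static std::vector<std::string> log;
    return log;
}

// Stand-in for the resumable body in echo.cpp; a plain function here.
inline void echo_impl(std::string conn) { echoed().push_back("echo:" + conn); }

// What echo.h would expose: an ordinary signature, erasure hidden behind it.
inline void echo(std::string conn) {
    spawn([s = std::move(conn)]() mutable { echo_impl(std::move(s)); });
}

// Runs and discards every queued task.
inline void run_all() {
    auto tasks = std::move(task_queue());
    task_queue().clear();
    for (auto& t : tasks) t();
}
```

Calling `echo("a")` only queues work; nothing is appended to `echoed()` until `run_all()` executes the erased tasks. The caller of `echo` never sees the lambda's type, which is the "manual type erasure" the message refers to.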

I am not so much criticizing Chris as criticizing myself. I already went
through this design when exploring lambda* nearly two years ago. I reached
the conclusion that, for the problems I tried to apply it to, it did not
offer anything to compensate for the complexity compared to the boring
C#-like await syntax. Hence, I no longer pursue this approach.

It does not mean that for *some* problems, some incarnation of lambda*
*might* be better than P0057. That is wonderful. When this happens, let's
add it in, in addition to P0057 and P0099 (modestly called "A low-level API
for stackful context switching").
Nicol Bolas
2015-10-14 13:59:33 UTC
Permalink
Post by Gor Nishanov
It does not mean that for *some* problems, some incarnation of lambda*
*might* be better than P0057. That is wonderful. When this happens, let's
add it in, in addition to P0057 and P0099 (modestly called "A low-level API
for stackful context switching").
There are two problems with the "let's add it in" approach.

First, how do you teach when to use which? For stackful vs. stackless, it's
quite easy. You usually know when you genuinely want a stack, and emulating
that with stackless (as I discovered) becomes amazingly difficult very
quickly. Similarly, stackful makes it very apparent that creating an
execution_context is allocating memory, so it's not cheap (though I was
surprised to see that Boost.Context's switching only cost ~50 cycles or so).

So with P0114 in the mix, how do you tell people when to use which? Do you
use P0114 when you want stackless that can go farther than one or two
functions? Is there some simple guideline you can tell beginners about when
to use which tool?

The second problem is interoperation. How P0099 and P0057 interoperate is
pretty obvious. How P0114 would interop with P0057 is... less obvious. What
happens if you `break resume` in an awaitable? Can you use `await` in a
resumable function? What madness does `await resumable <expr>` accomplish?

Now, they may have obvious interactions. I haven't gone through a detailed
analysis of both. But it is disconcerting.
Gor Nishanov
2015-10-14 14:12:13 UTC
Permalink
Post by Nicol Bolas
Post by Gor Nishanov
It does not mean that for *some* problems, some incarnation of lambda*
*might* be better than P0057. That is wonderful. When this happens, let's
add it in, in addition to P0057 and P0099 (modestly called "A low-level API
for stackful context switching").
There are two problems with the "let's add it in" approach.
First, how do you teach when to use which?
The second problem is interoperation. How P0114 would interop with P0057?
I need to learn to be more direct. The answers to your questions are in the
two "some"s I used in the sentence you quoted.
The first *some* means that there must be a problem for which lambda* has a
clear benefit over P0057. Thus, if lambda* is accepted, there is some
important problem that P0057 does not address efficiently. That should
give a pretty clear indication of when you should not use P0057 and should
use lambda* instead.

The second *some* is the one in "for *some* incarnation of lambda*". We
don't know what it is at the moment. Thus it is premature to worry how it
will interoperate. It might do so magnificently or not at all. We don't
know.
Germán Diago
2015-10-13 11:58:01 UTC
Permalink
Post by Evgeny Panasyuk
It is intrinsically non-zero overhead, due to type-erasure/allocations.
This fact alone is strong argument against it.
This is my main concern with the proposal: type-erasure and allocations.
Seems that there are some fancy optimizations possible, but I am not sure
why we should rely on these when there are other alternatives.
Post by Evgeny Panasyuk
generality and performance.
I agree with this. On top of that I do think you can implement everything on
top of this without polluting the core language.
Gor Nishanov
2015-10-13 13:25:22 UTC
Permalink
Post by Evgeny Panasyuk
This is my main concern with the proposal: type-erasure and allocations.
Seems that there are some fancy optimizations possible, but I am not sure
why we should rely on these when there are other alternatives.
At the moment, there is NO alternative that can match the zero overhead of
P0057. If you believe there is one, I offered you, earlier in this thread, a
way to validate whether your belief is true or not.
Arash Partow
2015-10-04 04:01:49 UTC
Permalink
Post by Nicol Bolas
The P0057 paper itself is actual wording, ready to be incorporated into the
standard. Not only that, P0057 has actual, live implementation experience
behind it. You can go get VS2015 right now and play with their
implementation of a version of this functionality.
P0114 seems more... experimental. It sounds like something that has been
discussed to some degree, but is as of yet lacking a proof-of-concept
implementation.
I believe CK may have a POC implementation available: a set
of patches against clang that adds the various features needed, e.g.
'resumable' et al.



That said, no single proposal has yet discussed why a compliant
compiler will never be able to deduce such scenarios without extra
keywords and rearranging of code - Similar things have been done to
achieve tail call optimizations, why can this not be done with
coroutines?
Nicol Bolas
2015-10-04 13:26:37 UTC
Permalink
Post by Arash Partow
That said, no single proposal has yet discussed why a compliant
compiler will never be able to deduce such scenarios without extra
keywords and rearranging of code - Similar things have been done to
achieve tail call optimizations, why can this not be done with
coroutines?
... I don't understand what you mean.

Again, this may just be my ignorance on the details of these coroutine
ideas, but I thought the idea of resumable functions (P0057), at its core,
was the ability to halt the execution of a function (yield) and later
return to where that function's execution was halted (resume). That is, to
my understanding, the core feature.

How could the compiler *deduce* that I want to do that... without giving me
syntax to actually do that (at which point it's not "deduction" anymore)?

You may be confusing P0057 with a pure await/async model that is bound to
CPU threads and so forth. It's rather lower-level than that. And even
that is not something which is deducible by the compiler, since it very
much affects the apparent behavior of the program.

By contrast, proper tail calls are not apparent behavior. Well, not with
regard to the standard at any rate.
Nicol Bolas
2015-10-10 14:22:40 UTC
Permalink
On the recent `operator await` syntax in P0057. Is it possible for a user
to call this operator themselves? And if so, will it work correctly
for types that don't provide one (that is, resolving to the original type
or issuing a compiler error)?

If not, it would probably be useful if the user could invoke it themselves.
Gor Nishanov
2015-10-10 14:58:13 UTC
Permalink
Post by Nicol Bolas
On the recent `operator await` syntax in P0057. Is it possible for a user
to call this operator themselves? And if so, will it work correctly
for types that don't provide one (that is, resolving to the original type
or issuing a compiler error)?
If not, it would probably be useful if the user could invoke it themselves.
Absolutely, observe me doing this in the main of the attached program:

int main() {

operator await(1ms);

One thing that is somewhat awkward:

operator await(1ms);

is not the same as

await 1ms;

The first one gets me an awaitable from 1ms. The second awaits 1ms.
A difference between calling an operator function directly and calling it via
operator notation is not exactly novel.
The behavior of (x || y) is different from calling operator||(x,y).
I don't like it much. I don't mind operator await to be renamed to
get_awaiter or something, but, I do think that operator await is prettier
and it is easier to think that language synthesizes operator await for some
types rather than synthesizing a function.
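
[Editorial sketch, not part of the original message.] One reading of the `||` remark is that the built-in operator short-circuits, whereas a user-provided `operator||` is an ordinary function whose operands are both evaluated. The snippet below checks that reading in plain C++; `flag`, `eval_count` and the helper functions are hypothetical names introduced only for this illustration.

```cpp
#include <cassert>

struct flag { bool value; };

// Overloaded operator||: an ordinary function call, so both operands are
// evaluated before the body runs. No short-circuiting.
inline bool operator||(flag a, flag b) { return a.value || b.value; }

// Counts how many times a right-hand operand was evaluated.
inline int& eval_count() { static int n = 0; return n; }
inline bool counted_bool() { ++eval_count(); return true; }
inline flag counted_flag() { ++eval_count(); return flag{true}; }

// Built-in ||: the right operand is skipped when the left one is true.
inline int builtin_rhs_evaluations() {
    eval_count() = 0;
    bool r = true || counted_bool();
    (void)r;
    return eval_count();
}

// Overloaded ||: the right operand is evaluated even though the left is true.
inline int overloaded_rhs_evaluations() {
    eval_count() = 0;
    bool r = flag{true} || counted_flag();
    (void)r;
    return eval_count();
}
```

The two helper functions return the number of right-hand evaluations in each case, which makes the difference between the operator as a language construct and the operator as a function observable.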

Again, I am slightly leaning toward operator await, rather than
get_awaiter, but if Core/Evolution/LWG/LEWG wants something else,
absolutely.
Gor Nishanov
2015-10-10 15:01:43 UTC
Permalink
Correction. I think you meant: (referring to classes in opawait attached in
the previous response)

operator await(awaiter{ 1ms });


not


operator await(1ms);


Yes, I think the wording is written to make it possible, however, I just
checked, the implementation we ship in VS Update 1 does not do that. I
filed a bug against myself.