Discussion:
Variadic append for std::string
Olaf van der Spek
2016-12-28 08:50:40 UTC
Hi,

One frequently needs to append stuff to strings, but the standard way
(s += std::string("A") + "B" + std::to_string(42)) isn't optimal due to temporaries.
A variadic append() for std::string seems like the obvious solution.
It could support string_view, integers and maybe floats,
but without formatting options.
It could even be extensible by calling append(s, t);

append(s, "A", "B", 42);

Would this be useful for the C++ std lib?
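
For illustration, a minimal sketch of how such an append() could look, assuming C++17 (fold expressions, std::string_view, std::to_chars). The helper name append_one is hypothetical, and only string-like values, single characters and integers are handled here; this is not a proposed interface.

```cpp
#include <cassert>
#include <charconv>
#include <cstddef>
#include <string>
#include <string_view>
#include <system_error>
#include <type_traits>

// Hypothetical helper: appends one string-like piece without building a temporary std::string.
inline void append_one(std::string& s, std::string_view piece) {
    s.append(piece);
}

// Hypothetical helper: appends a single character as-is.
inline void append_one(std::string& s, char c) {
    s.push_back(c);
}

// Hypothetical helper: appends an integer in decimal, with no formatting options.
template <class T,
          std::enable_if_t<std::is_integral_v<T> && !std::is_same_v<T, bool>, int> = 0>
void append_one(std::string& s, T value) {
    char digits[24];  // enough for any 64-bit integer including the sign
    const auto result = std::to_chars(digits, digits + sizeof digits, value);
    assert(result.ec == std::errc{});
    s.append(digits, static_cast<std::size_t>(result.ptr - digits));
}

// Variadic front end: expands to one append_one call per argument, left to right.
template <class... Args>
void append(std::string& s, const Args&... args) {
    (append_one(s, args), ...);
}
```

Each argument is written straight into s, so no intermediate std::string is created. Extending it to a user type would then be a matter of adding another append_one overload for that type.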
--
You received this message because you are subscribed to the Google Groups "ISO C++ Standard - Future Proposals" group.
To unsubscribe from this group and stop receiving emails from it, send an email to std-proposals+***@isocpp.org.
To post to this group, send email to std-***@isocpp.org.
To view this discussion on the web visit https://groups.google.com/a/isocpp.org/d/msgid/std-proposals/6dbc3e8e-8b74-4c5e-b7c5-4b90e08239b8%40isocpp.org.
l***@gmail.com
2016-12-28 10:03:18 UTC
Hello,

It would be useful, but it would probably still create temporary strings
while iterating over the variadic arguments (I'm not a variadic template expert).

My dream, though, is to have a library like this one
standardized: https://github.com/fmtlib/fmt

Laurent
Victor Zverovich
2017-01-14 21:02:02 UTC
Hello,

The author of the fmt library here. FWIW I've been working on a proposal to
introduce similar formatting functionality based on variadic templates to
the standard: http://fmtlib.net/Text%20Formatting.html . It is still a very
early and incomplete draft but I'd be glad to hear feedback.

With this functionality, one could do something like

std::string s = std::format("{}{}{}", "A", "B", 42);

This is somewhat more general than what the OP proposes because it allows
format specifiers and output targets other than strings. At the same time,
if properly implemented, it solves the problem of extra allocations and can
have performance similar to that of sprintf.

Any comments are very welcome.

Cheers,
Victor
g***@gmail.com
2017-01-14 21:35:44 UTC
I'm looking forward to formatting being proposed for the standard.

You touched on reducing allocations. I think there should be a variant of
the interface that reuses an existing string to format into, e.g.
std::string reuse_me;
std::fmt_existing(reuse_me, ...);

This would raise the possibility of no additional allocations in some
situations, particularly in a loop.
The user may also resize/reserve the string before the loop to reduce
the chance of any unexpected allocations or exceptions.
They could aim to set a worst-case size.

I haven't looked closely at fmt so apologies if this facility already
exists.
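
To make the loop case concrete, here is a small sketch using only std::string and std::to_chars (C++17). The function name build_lines and the 64-character worst case are assumptions made for the example; it counts how often the string's capacity changes while the same string is reused.

```cpp
#include <cassert>
#include <charconv>
#include <cstddef>
#include <string>
#include <system_error>

// Hypothetical example: builds "item 0", "item 1", ... in one reused string and
// returns how many times the string's capacity changed inside the loop.
inline std::size_t build_lines(std::string& reuse_me, int count) {
    reuse_me.reserve(64);  // worst-case line length, chosen before the loop
    std::size_t capacity_changes = 0;
    for (int i = 0; i < count; ++i) {
        const std::size_t before = reuse_me.capacity();
        reuse_me.clear();  // empties the string; common implementations keep the storage
        reuse_me += "item ";
        char digits[16];
        const auto result = std::to_chars(digits, digits + sizeof digits, i);
        assert(result.ec == std::errc{});
        reuse_me.append(digits, static_cast<std::size_t>(result.ptr - digits));
        if (reuse_me.capacity() != before) ++capacity_changes;
    }
    return capacity_changes;
}
```

As long as every line fits in the reserved size, the appends cannot reallocate, which is the effect a format-into-existing-string interface would be after.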
Victor Zverovich
2017-01-14 22:03:56 UTC
I agree that it would be useful to have a formatting function that appends
to an existing string. Incidentally, the facility to append to strings and
containers with contiguous storage was recently contributed to
fmt: https://github.com/fmtlib/fmt/pull/450
g***@gmail.com
2017-01-14 22:33:01 UTC
Post by Victor Zverovich
I agree that it would be useful to have a formatting function that appends
to existing string. Incidentally, the facility to append to strings and
containers with contiguous storage was recently contributed to
fmt: https://github.com/fmtlib/fmt/pull/450
I was just in the process of replying to say that decoupling formatting
from string would be great. And that seems to do that.
I didn't see exactly how it's used, but it seems (I'm sure someone will
correct me here) it would be nice if this worked:

std::vector<char> v;
fmt(v, "whatever");

I wonder if the standard containers could expose themselves as buffers that
would enable that, or this if it's better:

fmt(as_buffer(v), "whatever"); or fmt(v.as_buffer(), "whatever")

I don't see why std::array couldn't be made to work too. Does
your wrapper support array?

It seems it would be possible, at least if the standard could help here.
Nicol's/Boost's interface for default-initialized allocation would seem
useful to employ in this interface for fmt.
Thiago Macieira
2017-01-15 02:31:42 UTC
Post by g***@gmail.com
I was just in the process of replying to say that decoupling formatting
from string would be great. And that seems to do that.
I didn't see exactly how it's used but it seems (but I'm sure someone will
correct me here) it would be nice if this worked:
std::vector v;
fmt(v, "whatever");
Sorry, but.. why?

Why can't you just use std::string? Why does it need to be something
different?

The formatting code is likely to be big, so it's most likely not going to be
inline. Therefore, it can't be templated.
--
Thiago Macieira - thiago (AT) macieira.info - thiago (AT) kde.org
Software Architect - Intel Open Source Technology Center
g***@gmail.com
2017-01-15 03:58:30 UTC
Post by Thiago Macieira
Sorry, but.. why?
Why can't you just use std::string? Why does it need to be something
different?
The formatting code is likely to be big, so it's most likely not going to be
inline. Therefore, it can't be templated.
I wasn't saying you can't use a string, I was trying to say that it seems
to me fmt should be able to format into more than just a string.
Like an std::array, or std::vector or whatever, via an adaptor or
otherwise. Anything conforming to a buffer concept of some kind.

I.e. anything that can present a block of memory containing characters.

Quite how that works I'm not sure yet. I specifically wouldn't be keen on
a format that worked only with a string and then needed a copy to
somewhere else, unless there was a good reason. There may be a good reason,
but it seems we should be aware of this.
I also don't want format to have to create that thing either, so I don't
see why we should be forced to dynamically allocate memory in order to format,
which a string-only interface would force.
If string has a member called fmt that defers to some outer format, fine, if
that's a helper. But if a string was the sole means to format, I think
that's not a great thing without convincing motivation.
Thiago Macieira
2017-01-15 08:23:04 UTC
Post by g***@gmail.com
Post by Thiago Macieira
Sorry, but.. why?
Why can't you just use std::string? Why does it need to be something
different?
The formatting code is likely to be big, so it's most likely not going to be
inline. Therefore, it can't be templated.
I wasn't saying you can't use a string, I was trying to say that it seems
to be fmt should be able to format into more than just a string.
Like an std::array, or std::vector or or whatever, via an adaptor or
otherwise. anything conforming to a buffer concept of some kind.
Why can't we call that adaptor "std::string"?
Post by g***@gmail.com
i.e. any thing that can present a block of memory containing characters.
And that can be reallocated to extend its size, or truncated once the true
size is known. And is contiguous, of course.
Post by g***@gmail.com
quite how that works I'm not sure yet, I specifically wouldn't be keen to
only a format that worked only with a string and then need to copy that
somewhere else unless there was a good reason. There may be a good reason,
but it seems we should be aware of this.
Let's start with the good reason. It has to be good enough to overcome the
need to make the library function more complex for the 99.99% of the uses.

I know the C library does it, and that it can reuse the same formatters to
output to a preallocated string (snprintf) or a file (fprintf). But that
probably only works because the C library internally writes to a buffer and
flushes it periodically. So in order to do the same for us, for multiple
different output classes, the formatting function would likely have to have
its own buffering. That's what I meant when I said it would be more complex
than it needs to be for the 99.99% of the uses.
Post by g***@gmail.com
I also don't want to format have to create that thing either, so I don't
see why we should be forced to dynamically create memory in order to format
which a string only thing would force.
If it can't allocate or reallocate, then it must be prepared for failing in
case of buffer overrun. When formatting, you usually don't want that because
formatting can't be easily restarted from where it failed.
Post by g***@gmail.com
If string has a member called fmt that defers to some outer fomat, fine if
that's a helper, but if a string was the sole means to format, I think
that's not a great thing without convincing motivation.
See above.

I'd like to hear the motivation for formatting to anything else, compared to
the added complexity to make it work. In other words: is it worth it?
--
Thiago Macieira - thiago (AT) macieira.info - thiago (AT) kde.org
Software Architect - Intel Open Source Technology Center
Magnus Fromreide
2017-01-15 08:54:16 UTC
Post by Thiago Macieira
Let's start with the good reason. It has to be good enough to overcome the
need to make the library function more complex for the 99.99% of the uses.
I think that one primary motivation for a fmt type interface is to ease
internationalization efforts. From there follows that one common output
target would be a streambuf. Now, I know that streams and performance in the
same sentence is dodgy at best but why make it even worse with an unneeded
temporary std::string?

/MF
Thiago Macieira
2017-01-15 19:13:32 UTC
Post by Magnus Fromreide
I think that one primary motivation for a fmt type interface is to ease
internationalization efforts. From there follows that one common output
target would be a streambuf. Now, I know that streams and performance in the
same sentence is dodgy at best but why make it even worse with an unneeded
temporary std::string?
streambuf is fine, I guess. We need one output, not a template/concept.
--
Thiago Macieira - thiago (AT) macieira.info - thiago (AT) kde.org
Software Architect - Intel Open Source Technology Center
g***@gmail.com
2017-01-15 22:35:08 UTC
Post by Thiago Macieira
Post by Magnus Fromreide
I think that one primary motivation for a fmt type interface is to ease
internationalization efforts. From there follows that one common output
target would be a streambuf. Now, I know that streams and performance in the
same sentence is dodgy at best but why make it even worse with an unneeded
temporary std::string?
streambuf is fine, I guess. We need one output, not a template/concept.
I don't know for sure, but if I took bets I suspect a template/concept is
exactly what you want here.
Or at least one of the options.

I don't know what the template/concept would be, but at a minimum it would
be a buffer template/concept.
I think we need this anyway. At a maximum, maybe something like formattable
is needed that employs the former somehow. But I've no idea, and I have
hardly looked at what cpp format does.
But it seems to have such words there and I'm sure it has the feature set
covered.
The issue to me is how much nicer/simpler it can be if it's in the standard
and the standard can change to accommodate making it nicer.

If the end result looked anything less than easy to use or fast for string
then I wouldn't object
to string having a special case that addressed this.

But it seems remiss if we don't support these use cases:

// C interface: the caller supplies the memory, nothing is allocated.
extern "C" bool get_logfilename(char* buffer, int buffer_length)
{
    fixed_buffer fb(buffer, buffer_length);
    auto status = cpp::format(fb, whatever);
    return !status.failed();
}

// Same idea, but also reporting the formatted length.
std::pair<bool, std::size_t> get_logfilename(char* buffer, int buffer_length)
{
    fixed_buffer fb(buffer, buffer_length);
    auto format_status = cpp::format(fb, whatever);
    if (format_status.failed())
        return {false, {}};
    return {true, format_status.format_length};
}

// Variant that creates the adaptor through a factory function.
std::pair<bool, std::size_t> get_logfilename(char* buffer, int buffer_length)
{
    auto fb { std::make_fixed_buffer(buffer, buffer_length) };
    auto format_status = cpp::format(fb, whatever);
    if (format_status.failed())
        return {false, {}};
    return {true, format_status.format_length};
}

// Formatting into a std::string through an adaptor.
std::string get_logfilename()
{
    std::string logfilename;

    string_buffer sb(logfilename);
    auto format_status = cpp::format(sb, whatever);
    return logfilename;
}

So we have these adaptors that adapt types as needed, e.g. string_buffer
etc.
They model a buffer concept that would seem to require this:
is_fixed_size()
size()
resize()
resize_default_init(); // The boost/nicol proposal
capacity()

The fixed_buffer type provides an interface that models buffer
but allows writing into an arbitrary fixed-size memory region.
A buffer that can't resize has is_fixed_size() return true.
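
As a rough sketch of the fixed_buffer idea described above (C++17), assuming an append() member and a two-value format_error enum that are not part of anything proposed in this thread. The file name pieces "app" and ".log" are made up for the example.

```cpp
#include <algorithm>
#include <cassert>
#include <cstddef>
#include <string_view>

// Hypothetical error code: just enough to tell "fits" from "truncated".
enum class format_error { none, buffer_full };

// Hypothetical adaptor over caller-owned memory; it never allocates.
class fixed_buffer {
public:
    fixed_buffer(char* data, std::size_t capacity) : data_(data), capacity_(capacity) {}

    static constexpr bool is_fixed_size() { return true; }
    std::size_t size() const { return size_; }
    std::size_t capacity() const { return capacity_; }

    // Copies as much of text as fits and reports buffer_full on truncation.
    format_error append(std::string_view text) {
        assert(size_ <= capacity_);
        const std::size_t n = std::min(text.size(), capacity_ - size_);
        std::copy_n(text.data(), n, data_ + size_);
        size_ += n;
        return n == text.size() ? format_error::none : format_error::buffer_full;
    }

private:
    char* data_;
    std::size_t capacity_;
    std::size_t size_ = 0;
};

// C-callable wrapper: no exceptions, no allocation, result is NUL-terminated.
extern "C" bool get_logfilename(char* buffer, int buffer_length) {
    if (buffer == nullptr || buffer_length <= 0) return false;
    // Reserve one byte for the terminating NUL.
    fixed_buffer fb(buffer, static_cast<std::size_t>(buffer_length) - 1);
    const bool ok = fb.append("app") == format_error::none &&
                    fb.append(".log") == format_error::none;
    buffer[fb.size()] = '\0';
    return ok;
}
```

When the caller's array is too small, the function returns false and leaves a truncated but terminated string behind, which is the "buffer full" outcome rather than an allocation or an exception.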

And then the other adaptors like string_buffer as needed but:
I don't see why std::unique_ptr can't also be invited to the party here.
But if not, oh well.


But I see no reason why std::string, std::vector, std::array can't model a
buffer directly but they don't have to.
If they did, wouldn't you be down to:

std::string get_logfilename()
{
    std::string logfilename;

    auto format_status = cpp::format(logfilename, whatever);
    return logfilename;
}

I don't want anyone to get hung up on this particular interface. The main
thing to me is that the interface supports these goals (in order of
importance):
* we don't require memory allocation to format. (essential)
* we support types beyond string unless there is a good reason not to.
* we can format and avoid exceptions if we wish
* we can diagnose when something failed and why so we can then throw if
it's important.
* we enable init/default initialized resize to be used here to help
efficiency. it seems formatting can use this.

We need to know how certain formatting conditions fail and not just have
only a failed bit to check.
IMO we need to know at least:
out of memory (malloc/new failure),
buffer full (i.e. wanted to resize() but is_fixed_size() said no),
bad format string,
bad argument

I think we need to maximally attempt to format as much as possible and not
throw.
but format_status can contain an std::expected or something that exactly
says the issue.
you can expect that but typically wouldn't.
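
One way to picture a status that says why formatting failed, sketched in C++17. The names format_errc and check_format are assumptions; format_status, failed() and format_length follow the names used earlier in this post. The checker only understands plain "{}" placeholders, and "bad argument" is interpreted here, for the sake of the example, as an argument count that does not match the placeholders.

```cpp
#include <cstddef>
#include <string_view>

// Hypothetical error categories, following the list above.
enum class format_errc { ok, out_of_memory, buffer_full, bad_format_string, bad_argument };

// Hypothetical status object: carries the reason instead of a single failed bit.
struct format_status {
    format_errc error = format_errc::ok;
    std::size_t format_length = 0;  // format-string characters accepted before stopping
    constexpr bool failed() const { return error != format_errc::ok; }
};

// Validates a format string without throwing and reports which check failed.
constexpr format_status check_format(std::string_view fmt, std::size_t arg_count) noexcept {
    std::size_t placeholders = 0;
    for (std::size_t i = 0; i < fmt.size(); ++i) {
        if (fmt[i] == '{') {
            if (i + 1 == fmt.size() || fmt[i + 1] != '}')
                return {format_errc::bad_format_string, i};
            ++placeholders;
            ++i;  // skip the closing brace
        } else if (fmt[i] == '}') {
            return {format_errc::bad_format_string, i};
        }
    }
    if (placeholders != arg_count)
        return {format_errc::bad_argument, fmt.size()};
    return {format_errc::ok, fmt.size()};
}
```

A caller that cares can inspect the error member and throw; a caller that doesn't can just test failed().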

But let's see whether the evidence supports these positions.

All of this seems achievable and cpp format seems to provide a lot of this
feature set.

To me it's just a matter of seeing whether cpp format supports these needs,
how, and if not why not, and then
seeing how the Standard needs to change to make formatting even
easier/simpler than
whatever cpp format already does without that support today.
g***@gmail.com
2017-01-15 11:00:51 UTC
Post by Thiago Macieira
Post by g***@gmail.com
I wasn't saying you can't use a string, I was trying to say that it seems
to be fmt should be able to format into more than just a string.
Like an std::array, or std::vector or or whatever, via an adaptor or
otherwise. anything conforming to a buffer concept of some kind.
Why can't we call that adaptor "std::string"?
Well nobody knows for sure until we have seen the options and what the problems
are with each option. And it might mean that there are several interfaces
required. cpp format seems to have a few so that probably tells you that we
will need a few and more than just string.

But to me, in theory at least, if there is going to be only one interface,
it can't be string, as having to allocate to do a format is a non-starter to
me.
It's too slow. I don't see it as viable that you have to allocate to format.
Post by Thiago Macieira
If it can't allocate or reallocate, then it must be prepared for failing in
case of buffer overrun. When formatting, you usually don't want that because
formatting can't be easily restarted from where it failed.
I think being prepared for failure is always a reality, isn't it?
Reallocation can fail, so?
But having to do memory allocation at all is the main issue here for me, and
string is all that.

If I have a C interface where the caller passes me an array that's large
enough, why should I have to allocate memory to use it?
Aren't I at risk of creating a string-intensive API, which is often a
slowdown for many apps?

just string isn't enough
Thiago Macieira
2017-01-15 19:17:01 UTC
Post by g***@gmail.com
I think being prepared for failure is always a reality isn't it?
reallocation can fail, so?
but having to do memory allocation thing is the main issue here for me and
string is all that.
When a memory allocation fails, you get std::bad_alloc, which throws away the
entire attempt. So you don't attempt to continue, you just fail completely and
your output buffer contains unspecified contents.

Sized format functions like snprintf print as much as they can, then return a
value saying how much space they actually needed. That's more complex.
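
To show that behaviour concretely, a short sketch built on std::snprintf. The function name format_name and the 8-byte first buffer are choices made for the example only.

```cpp
#include <cassert>
#include <cstddef>
#include <cstdio>
#include <string>

// Formats "<name>-<id>" with snprintf, retrying once if the first buffer was too small.
inline std::string format_name(const char* name, int id) {
    char small_buf[8];
    // snprintf writes at most sizeof small_buf - 1 characters plus a terminating NUL
    // and returns the length the complete output would have had.
    const int needed = std::snprintf(small_buf, sizeof small_buf, "%s-%d", name, id);
    assert(needed >= 0);  // a negative value signals an encoding error
    if (static_cast<std::size_t>(needed) < sizeof small_buf)
        return std::string(small_buf, static_cast<std::size_t>(needed));

    // Truncated: size the string exactly and format again from the start.
    std::string result(static_cast<std::size_t>(needed), '\0');
    std::snprintf(result.data(), result.size() + 1, "%s-%d", name, id);
    return result;
}
```

The second call redoes all of the formatting work, because the truncated first attempt cannot simply be continued from where it stopped.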
Post by g***@gmail.com
if I have a C interface where the caller passes me an array that's large
enough, why should I have to allocate memory to use it.
aren't I at risk of creating a string intensive api which is often a slow
down for many apps?
Understood. I am saying that has to be weighed against the normal use-case
which is to output to a std::string. That has to be EASY to write and perform
really well.
--
Thiago Macieira - thiago (AT) macieira.info - thiago (AT) kde.org
Software Architect - Intel Open Source Technology Center
Victor Zverovich
2017-01-15 15:51:50 UTC
Post by Thiago Macieira
Why can't we call that adaptor "std::string"?
We can use std::string but it will add extra copying and memory allocation
when formatting anywhere else. For this reason the fmt library decouples
buffer management and formatting.
Post by Thiago Macieira
Let's start with the good reason. It has to be good enough to overcome the
need to make the library function more complex for the 99.99% of the uses.
See above. Also the complexity added by making buffer management more
generic is very small compared to the complexity of the actual formatting.
And formatting to std::string, although one of the main use cases, from my
experience is not anywhere close to 99%.
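
A sketch of what decoupling buffer management from formatting can look like in general, assuming C++17. This is one possible arrangement and not a description of how fmt implements it. The names output_buffer, vector_buffer and format_log_name are hypothetical; string_buffer borrows a name used earlier in the thread. The formatting function is not a template, so it could live out of line and still serve several output types.

```cpp
#include <cassert>
#include <charconv>
#include <cstddef>
#include <string>
#include <string_view>
#include <system_error>
#include <vector>

// Hypothetical abstract output: the only thing the formatting code knows about.
class output_buffer {
public:
    virtual ~output_buffer() = default;
    virtual void append(std::string_view text) = 0;
};

// Adaptor that grows a std::string.
class string_buffer final : public output_buffer {
public:
    explicit string_buffer(std::string& target) : target_(target) {}
    void append(std::string_view text) override { target_.append(text); }
private:
    std::string& target_;
};

// Adaptor that grows a std::vector<char>.
class vector_buffer final : public output_buffer {
public:
    explicit vector_buffer(std::vector<char>& target) : target_(target) {}
    void append(std::string_view text) override {
        target_.insert(target_.end(), text.begin(), text.end());
    }
private:
    std::vector<char>& target_;
};

// Non-template formatting code: written once, independent of the output type.
void format_log_name(output_buffer& out, std::string_view stem, int index) {
    char digits[16];
    const auto result = std::to_chars(digits, digits + sizeof digits, index);
    assert(result.ec == std::errc{});
    out.append(stem);
    out.append("-");
    out.append(std::string_view(digits, static_cast<std::size_t>(result.ptr - digits)));
    out.append(".log");
}
```

The cost is one virtual call per appended piece; in exchange, writing into something other than std::string needs no intermediate copy.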
g***@gmail.com
2017-01-15 04:04:23 UTC
Post by Thiago Macieira
Post by g***@gmail.com
I was just in the process of replying to say that decoupling formatting
from string would be great. And that seems to do that.
I didn't see exactly how it's used but it seems (but I'm sure someone
will
Post by g***@gmail.com
std::vector v;
So fmt(v, "whatever");
Sorry, but.. why?
Why can't you just use std::string? Why does it need to be something
different?
The formatting code is likely to be big, so it's most likely not going to be
inline. Therefore, it can't be templated.
--
Thiago Macieira - thiago (AT) macieira.info - thiago (AT) kde.org
Software Architect - Intel Open Source Technology Center
I could also imagine wanting to format into a unique_ptr or a char array
from C. I can write an adaptor to expose that if nothing already adapts
it. So no string at all there.
I think the OP has these situations covered. It's just about how much
better they would be if the library comes into the standard and the
standard can then adapt, if need be, to make things even smoother for
the library.
I can imagine the standard helping with such adaptors if that's what it needs.

Anyway, I don't know how it would work, but my point is that string should
not be the only sink for formatting, and the standard should make those
other sinks very usable out of the box.
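
To make the "char array from C" case concrete, here is a minimal C++17 sketch that formats an integer straight into a caller-supplied buffer, with no std::string involved. format_into is a hypothetical name chosen for illustration; only std::to_chars is a standard facility, and this is not how the fmt library does it.

```cpp
#include <charconv>
#include <cstddef>
#include <string_view>
#include <system_error>

// Hypothetical helper: writes the decimal form of `value` into the caller's
// buffer and returns a view of the characters written. Returns an empty view
// if the buffer is too small. No heap allocation takes place.
inline std::string_view format_into(char* buf, std::size_t size, int value) {
    auto [end, ec] = std::to_chars(buf, buf + size, value);
    if (ec != std::errc{}) {
        return {};
    }
    return std::string_view(buf, static_cast<std::size_t>(end - buf));
}
```

The sink here is just a pointer and a length, so the same helper would work on a stack array, a buffer owned by a unique_ptr<char[]>, or the storage of some other string class.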
--
Tony V E
2017-06-17 22:42:10 UTC
Post by Thiago Macieira
Post by g***@gmail.com
I was just in the process of replying to say that decoupling formatting
from string would be great. And that seems to do that.
I didn't see exactly how it's used but it seems (but I'm sure someone
will
Post by g***@gmail.com
std::vector v;
So fmt(v, "whatever");
Sorry, but.. why?
Why can't you just use std::string? Why does it need to be something
different?
I might want to format into a QString? Without the copy in between?
Post by Thiago Macieira
The formatting code is likely to be big, so it's most likely not going to be
inline. Therefore, it can't be templated.
I agree there might be trade-offs that need to be weighed.
I don't think we really know what they will be yet.
--
Be seeing you,
Tony
--
Andrey Semashev
2016-12-28 11:12:17 UTC
Post by Olaf van der Spek
Hi,
One frequently needs to append stuff to strings, but the standard way
(s += "A" + "B" + to_string(42)) isn't optimal due to temporaries.
A variadic append() for std::string seems like the obvious solution.
It could support string_view, integers, maybe floats
but without formatting options..
It could even be extensible by calling append(s, t);
append(s, "A", "B", 42);
Would this be useful for the C++ std lib?
I think, conceptually, formatting should be decoupled from string. It
would probably make sense to have `append` work with strings (although
expression templates would be even better, I think), but it should not
deal with formatting stuff.
--
Thiago Macieira
2016-12-28 12:20:54 UTC
On Wednesday, 28 December 2016, at 00:50:40 BRST, Olaf van der Spek
Post by Olaf van der Spek
Hi,
One frequently needs to append stuff to strings, but the standard way
(s += "A" + "B" + to_string(42)) isn't optimal due to temporaries.
A variadic append() for std::string seems like the obvious solution.
It could support string_view, integers, maybe floats
but without formatting options..
It could even be extensible by calling append(s, t);
append(s, "A", "B", 42);
Would this be useful for the C++ std lib?
It can also be solved without code change, at least the string parts. The
following code in Qt does exactly one allocation:

#define QT_USE_FAST_OPERATOR_PLUS
#include <qstring.h>

QString s;
s += "A" + QLatin1String("B") + '.' + QLatin1Char(';');

This expression will do a two-pass scan of all the arguments: first, it
calculates their maximum sizes (some of them may shrink when converted from
UTF-8 to UTF-16). Once that is known, it increases s's storage, memcpys the
data or does an in-place conversion into the buffer, then shrinks s's size
to the actual size (no reallocation).
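
The same two-pass idea can be sketched for std::string in standard C++17. This is only an illustration, not Qt's implementation: append_all is a made-up name, and it skips the encoding conversion, so the first pass yields exact sizes rather than upper bounds.

```cpp
#include <cstddef>
#include <string>
#include <string_view>

// Hypothetical helper, not a standard function: appends every piece to `s`
// with at most one reallocation. Each piece must be convertible to
// std::string_view (string literals, std::string, std::string_view).
template <typename... Pieces>
void append_all(std::string& s, const Pieces&... pieces) {
    // Pass 1: add up the sizes of all pieces.
    const std::size_t extra =
        (std::size_t{0} + ... + std::string_view(pieces).size());
    s.reserve(s.size() + extra);
    // Pass 2: copy each piece into the storage reserved above.
    (s.append(std::string_view(pieces)), ...);
}
```

For example, append_all(s, "A", std::string("B"), ".") grows s once and then copies three pieces, with no temporary strings created by the concatenation itself.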

The only reason why you have to have that #define is that the plus expression
results in a QStringBuilder<...> template instance instead of QString, so an
expression like:

(QLatin1String("%1 ") + types).arg(n)

is valid without it but will fail to compile with the macro. The solution is
an explicit cast to QString:

QString(QLatin1String("%1 ") + types).arg(n)

The fast operator plus is also enabled for QByteArray.

Without the macro, you can use the otherwise-unused operator%:

s += "A" % QLatin1String("B") % '.' % QLatin1Char(';');
--
Thiago Macieira - thiago (AT) macieira.info - thiago (AT) kde.org
Software Architect - Intel Open Source Technology Center
--
Olaf van der Spek
2016-12-28 18:09:30 UTC
On Wednesday, 28 December 2016, at 00:50:40 BRST, Olaf van der Spek
Post by Olaf van der Spek
Hi,
One frequently needs to append stuff to strings, but the standard way
(s += "A" + "B" + to_string(42)) isn't optimal due to temporaries.
A variadic append() for std::string seems like the obvious solution.
It could support string_view, integers, maybe floats
but without formatting options..
It could even be extensible by calling append(s, t);
append(s, "A", "B", 42);
Would this be useful for the C++ std lib?
It can also be solved without code change, at least the string parts. The
The non-string parts are kinda important.
Depending on a macro definition is a no-go and overloading operator+ for
all kinds of types is probably not a good idea.
--
Thiago Macieira
2016-12-29 03:35:08 UTC
On Wednesday, 28 December 2016, at 10:09:30 BRST, Olaf van der Spek
Post by Olaf van der Spek
Post by Thiago Macieira
Post by Olaf van der Spek
append(s, "A", "B", 42);
Would this be useful for the C++ std lib?
It can also be solved without code change, at least the string parts. The
The non-string parts are kinda important.
I disagree. Anything but other strings and elements of strings (characters)
should be done with the proper string formatting functions. Otherwise, we'll
soon have someone asking for hex formatting, zero padding, etc. We already
have the right tools for that.
--
Thiago Macieira - thiago (AT) macieira.info - thiago (AT) kde.org
Software Architect - Intel Open Source Technology Center
--
Nicol Bolas
2016-12-29 05:45:19 UTC
On Wednesday, 28 December 2016, at 10:09:30 BRST, Olaf van der Spek
Post by Olaf van der Spek
Post by Thiago Macieira
Post by Olaf van der Spek
append(s, "A", "B", 42);
Would this be useful for the C++ std lib?
It can also be solved without code change, at least the string parts. The
The non-string parts are kinda important.
I disagree. Anything but other strings and elements of strings (characters)
should be done with the proper string formatting functions. Otherwise, we'll
soon have someone asking for hex formatting, zero padding, etc. We already
have the right tools for that.
We kinda have the right tools for that ;) We still don't have a decent
C++-ified printf, despite having had variadic templates for 2 language
revisions now.
--
Olaf van der Spek
2016-12-29 08:44:31 UTC
Post by Thiago Macieira
On Wednesday, 28 December 2016, at 10:09:30 BRST, Olaf van der Spek
Post by Olaf van der Spek
Post by Thiago Macieira
Post by Olaf van der Spek
append(s, "A", "B", 42);
Would this be useful for the C++ std lib?
It can also be solved without code change, at least the string parts. The
The non-string parts are kinda important.
I disagree. Anything but other strings and elements of strings (characters)
should be done with the proper string formatting functions. Otherwise, we'll
What proper functions would that be?
Are they as performant as the proposed functions?
Post by Thiago Macieira
soon have someone asking for hex formatting, zero padding, etc. We already
have the right tools for that.
Do we?
--
Olaf
--
Thiago Macieira
2016-12-29 12:19:10 UTC
On Thursday, 29 December 2016, at 09:44:31 BRST, Olaf van der Spek
Post by Olaf van der Spek
Post by Thiago Macieira
I disagree. Anything but other strings and elements of strings (characters)
should be done with the proper string formatting functions. Otherwise, we'll
What proper functions would that be?
I think in the standard library, that's std::stringstream. I don't use it, so
I wouldn't know.

QString has them built in: the .arg() overloads.
Post by Olaf van der Spek
Are they as performant as the proposed functions?
They don't have to be because they serve different purposes. We also need
something that can support internationalisation (i18n) and concatenation with
plus operators can't do that.
Post by Olaf van der Spek
Post by Thiago Macieira
soon have someone asking for hex formatting, zero padding, etc. We already
have the right tools for that.
Do we?
Yes.
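
For reference, this is what the hex formatting and zero padding mentioned above look like with std::ostringstream today. to_hex is a made-up helper name used only for this sketch; the stream manipulators are standard.

```cpp
#include <iomanip>
#include <sstream>
#include <string>

// Illustrative helper: renders `value` in lowercase hex with a "0x" prefix,
// zero-padded to `width` digits.
inline std::string to_hex(unsigned value, int width) {
    std::ostringstream os;
    // std::setw applies only to the next insertion, which is `value` here.
    os << "0x" << std::hex << std::setfill('0') << std::setw(width) << value;
    return os.str();
}
```

Calling to_hex(48879, 8) gives "0x0000beef". It works, but it constructs a stream (and its locale) for one small conversion, which is the cost being debated in this thread.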
--
Thiago Macieira - thiago (AT) macieira.info - thiago (AT) kde.org
Software Architect - Intel Open Source Technology Center
--
Victor Dyachenko
2016-12-29 06:54:32 UTC
On Wednesday, December 28, 2016 at 11:50:40 AM UTC+3, Olaf van der Spek
Post by Olaf van der Spek
Hi,
One frequently needs to append stuff to strings, but the standard way
(s += "A" + "B" + to_string(42)) isn't optimal due to temporaries.
A variadic append() for std::string seems like the obvious solution.
It could support string_view, integers, maybe floats
but without formatting options..
It could even be extensible by calling append(s, t);
append(s, "A", "B", 42);
Would this be useful for the C++ std lib?
I think operator<<() would be better here:

s << "A" << "B" << to_string(42);

And about support for non-character types. It is useful at least for
diagnostics. No need for powerful formatting here, just the ability to
convert any fundamental type to text of any form. iostreams are too
cumbersome; printf is not generic (one needs to specify the exact specifier
for the type).
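
A minimal sketch of what such an operator<< could look like, assuming C++17. Nothing here is standard or proposed wording: the namespace name is hypothetical, and numbers simply go through std::to_string with no formatting options.

```cpp
#include <string>
#include <string_view>
#include <type_traits>

namespace string_stream_sketch {  // hypothetical namespace for this sketch

// Text pieces (literals, std::string, std::string_view) are appended as-is.
inline std::string& operator<<(std::string& s, std::string_view piece) {
    s.append(piece);
    return s;
}

// A single char is appended as a character, not as its numeric code.
inline std::string& operator<<(std::string& s, char c) {
    s.push_back(c);
    return s;
}

// Every other arithmetic type is converted with std::to_string.
template <typename T,
          std::enable_if_t<std::is_arithmetic_v<T> && !std::is_same_v<T, char>,
                           int> = 0>
std::string& operator<<(std::string& s, T value) {
    s += std::to_string(value);
    return s;
}

}  // namespace string_stream_sketch
```

With a using-directive for the namespace in scope, `s << "A" << "B" << 42;` appends "AB42" to s. Each step may still reallocate, so unlike a variadic append this does not get the single-allocation benefit.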
--
Thiago Macieira
2016-12-29 12:20:59 UTC
On Wednesday, 28 December 2016, at 22:54:32 BRST, Victor Dyachenko
Post by Victor Dyachenko
s << "A" << "B" << to_string(42);
And about support for non-character types. It is useful at least for
diagnostics. No need for powerful formatting here, just the ability to
convert any fundamental type to text of any form. iostreams are too
cumbersome; printf is not generic (one needs to specify the exact specifier
for the type).
Why can't you use std::stringstream or another std::ostream here? I know
you're saying it's cumbersome, and I agree that the iostreams part of the
standard library is an extreme overkill (using polymorphism for things that
didn't need it). Still, we have the tool.

Why not fix iostreams instead?
--
Thiago Macieira - thiago (AT) macieira.info - thiago (AT) kde.org
Software Architect - Intel Open Source Technology Center
--
Victor Dyachenko
2016-12-29 12:35:57 UTC
On Wednesday, 28 December 2016, at 22:54:32 BRST, Victor Dyachenko
Post by Victor Dyachenko
s << "A" << "B" << to_string(42);
And about support for non-character types. It is useful at least for
diagnostics. No need for powerful formatting here, just the ability to
convert any fundamental type to text of any form. iostreams are too
cumbersome; printf is not generic (one needs to specify the exact specifier
for the type).
Why can't you use std::stringstream or another std::ostream here? I know
you're saying it's cumbersome, and I agree that the iostreams part of the
standard library is an extreme overkill (using polymorphism for things that
didn't need it). Still, we have the tool.
"We have unusable tool. Nobody uses it including myself, but we have it!"
:-)
Why not fix iostreams instead?
Because it is not fixable by design. It tries to be everything, so any
implementation will be bloated. Dependency on the locales, which weighs
more than 1MB per se, states everything (formatting parameters, flags,
etc), virtual calls, et al. I don't require any of that just to build
the error message in a small function, like this:

result_t res = call(...);
if (failed(res))
    throw std::logic_error(std::string() << "The call() returned " << res);
--
Victor Dyachenko
2016-12-29 12:42:50 UTC
FIX: s/ states everything / states everywhere /
--
Michał Dominiak
2016-12-29 13:02:20 UTC
Ah yes, because obviously the primary thing that `std::string`'s interface
needs is *more functionality*. </s>
Post by Victor Dyachenko
FIX: s/ states everything / states everywhere /
--
Andrey Semashev
2016-12-29 13:09:59 UTC
Post by Thiago Macieira
Why not fix iostreams instead?
Because it is not fixable by design. It tries to be everything, so any
implementation will be bloated. Dependency on the locales, which weighs
more than 1MB per se, states everything (formatting parameters, flags,
etc), virtual calls, et al. I don't require any of that just to build
the error message in a small function, like this:

result_t res = call(...);
if (failed(res))
    throw std::logic_error(std::string() << "The call() returned " << res);
std::string is already bloated too much. Adding yet more bloat is hardly
the way to go.
--
Olaf van der Spek
2016-12-29 13:11:56 UTC
std::string is already bloated too much. Adding yet more bloat is hardly the
way to go.
We're not requesting stuff to be added to std::string. ;)
--
Olaf
--
Andrey Semashev
2016-12-29 13:56:52 UTC
Post by Olaf van der Spek
std::string is already bloated too much. Adding yet more bloat is hardly the
way to go.
We're not requesting stuff to be added to std::string. ;)
Adding operator<< overloads for std::string is adding to std::string. As
is creating a specialized append() algorithm that is targeted
specifically at string manipulation.
--
Olaf van der Spek
2016-12-29 14:00:47 UTC
Post by Olaf van der Spek
std::string is already bloated too much. Adding yet more bloat is hardly the
way to go.
We're not requesting stuff to be added to std::string. ;)
Adding operator<< overloads for std::string is adding to std::string. As is
creating a specialized append() algorithm that is targeted specifically at
string manipulation.
What are you suggesting?
I'm fine with s being a template parameter allowing you to maybe use
it for file or vector<char> too...

A string algo is exactly what (some) people need..
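
One possible reading of "s being a template parameter", sketched in C++17. Everything in the sketch namespace is hypothetical: the sink is assumed to be a container of char with insert() and end(), which covers std::string and std::vector<char> but not a file, and integers are converted with std::to_chars without any formatting options.

```cpp
#include <charconv>
#include <string>
#include <string_view>
#include <type_traits>
#include <vector>

namespace sketch {  // hypothetical namespace for this sketch

// Text pieces are copied into the sink unchanged.
template <typename Sink>
void append_one(Sink& sink, std::string_view text) {
    sink.insert(sink.end(), text.begin(), text.end());
}

// A single char is appended as a character.
template <typename Sink>
void append_one(Sink& sink, char c) {
    sink.insert(sink.end(), c);
}

// Integers (other than char and bool) are written in base 10.
template <typename Sink, typename Int,
          std::enable_if_t<std::is_integral_v<Int> &&
                               !std::is_same_v<Int, char> &&
                               !std::is_same_v<Int, bool>,
                           int> = 0>
void append_one(Sink& sink, Int value) {
    char buf[24];  // large enough for any 64-bit integer in base 10
    auto res = std::to_chars(buf, buf + sizeof buf, value);
    sink.insert(sink.end(), buf, res.ptr);
}

// The variadic front end: appends each argument in order.
template <typename Sink, typename... Args>
void append(Sink& sink, const Args&... args) {
    (append_one(sink, args), ...);
}

}  // namespace sketch
```

With this, sketch::append(s, "A", "B", 42) works on a std::string and on a std::vector<char> alike. A user-defined type could be supported by adding another append_one overload, which is roughly the extensibility hook mentioned in the first post.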
--
Olaf
--
Andrey Semashev
2016-12-29 14:12:55 UTC
Post by Olaf van der Spek
Post by Olaf van der Spek
std::string is already bloated too much. Adding yet more bloat is hardly the
way to go.
We're not requesting stuff to be added to std::string. ;)
Adding operator<< overloads for std::string is adding to std::string. As is
creating a specialized append() algorithm that is targeted specifically at
string manipulation.
What are you suggesting?
I'm fine with s being a template parameter allowing you to maybe use
it for file or vector<char> too...
A string algo is exactly what (some) people need..
I don't have a proposal, but if that's a generic formatting algorithm
then it should not be coupled with std::string and of course it should
not be named append() (because, well, appending is the least of the
things it does). I think, a C++ version of sprintf() is a frequently
requested feature, and it seems it could potentially fit your case as well.
--
Olaf van der Spek
2016-12-29 14:24:33 UTC
I don't have a proposal, but if that's a generic formatting algorithm then
From my first post: "It could support string_view, integers, maybe floats
**but without formatting options**.."
it should not be coupled with std::string and of course it should not be
named append() (because, well, appending is the least of the things it
does). I think, a C++ version of sprintf() is a frequently requested
feature, and it seems it could potentially fit your case as well.
Maybe, but the formatting options and string make it vastly more
complex. It'd also have a different interface. Maybe it'd even use my
proposed append function internally.. ;)
--
Olaf
--
Andrey Semashev
2016-12-29 17:23:43 UTC
Post by Olaf van der Spek
I don't have a proposal, but if that's a generic formatting algorithm then
From my first post: "It could support string_view, integers, maybe floats
**but without formatting options**.."
Well, it may not support formatting options, but as long as it converts
arbitrary data to strings (or anything that can act like a string), it's
still a generic formatting algorithm.
Post by Olaf van der Spek
it should not be coupled with std::string and of course it should not be
named append() (because, well, appending is the least of the things it
does). I think, a C++ version of sprintf() is a frequently requested
feature, and it seems it could potentially fit your case as well.
Maybe, but the formatting options and string make it vastly more
complex. It'd also have a different interface. Maybe it'd even use my
proposed append function internally.. ;)
If we're talking about an sprintf() equivalent, I don't think a more
flexible formatting implementation can be based on top of a less
flexible one (that is, your proposed function).
--
Victor Dyachenko
2016-12-29 13:16:23 UTC
Post by Andrey Semashev
std::string is already bloated too much.
Agree.
Post by Andrey Semashev
Adding yet more bloat is hardly
the way to go.
Specifically these features can be implemented in 50 lines of code using
sprintf(), and in a few hundred w/o any dependencies.
--
Andrey Semashev
2016-12-29 14:06:09 UTC
Post by Andrey Semashev
Adding yet more bloat is hardly
the way to go.
Specifically these features can be implemented in 50 lines of code
using sprintf(), and in a few hundred w/o any dependencies.
The exact number of lines is not my only concern, although I doubt that
your estimate is accurate, at least in the "no added dependencies" case.
Correctly formatting FP numbers, in particular, sounds like a
complicated task. Then there is support for user-defined types, some of
which are probably ostreamable - would you not want existing operator<<
to be used? This brings a dependency on iostreams.

Conceptually, you're adding more functionality to std::string, which IMO
should be nothing more than a container of characters. Any formatting
tools should build on top of that container (and maybe even allow
different containers to be used) instead of hijacking it.
--
Nicol Bolas
2016-12-29 16:47:02 UTC
Post by Andrey Semashev
Post by Thiago Macieira
Why not fix iostreams instead?
Because it is not fixable by design. It tries to be everything, so any
implementation will be bloated. Dependency on the locales, which weighs
more than 1MB per se, states everything (formatting parameters, flags,
etc), virtual calls, et al. I don't require any of that just to build
the error message in a small function, like this:

result_t res = call(...);
if (failed(res))
    throw std::logic_error(std::string() << "The call() returned " << res);
std::string is already bloated too much. Adding yet more bloat is hardly
the way to go.
I find this "too bloated" argument to be unconvincing.

Does `std::basic_string` have functions that are, strictly speaking, not
necessary? Yes. But that doesn't mean you shouldn't add functions that
actually *are necessary* to the interface. The fact that a type already has
unneeded member functions shouldn't stop you from putting needed ones in
there.

Now, you can argue that it actually *isn't* necessary, that there are ways
to achieve the same performance of a concatenation function without making
it a member. But "appeal to bloat" isn't a valid argument against it.
--
Andrey Semashev
2016-12-29 17:18:28 UTC
Post by Andrey Semashev
std::string is already bloated too much. Adding yet more bloat is hardly
the way to go.
I find this "too bloated" argument to be unconvincing.
Does `std::basic_string` have functions that are, strictly speaking, not
necessary? Yes. But that doesn't mean you shouldn't add functions that
actually /are necessary/ to the interface. The fact that a type already
has unneeded member functions shouldn't stop you from putting needed
ones in there.
I don't think formatting qualifies as the "necessary" or "needed"
functions for std::string.
Post by Andrey Semashev
Now, you can argue that it actually /isn't/ necessary, that there are
ways to achieve the same performance of a concatenation function without
making it a member. But "appeal to bloat" isn't a valid argument against it.
It is, because that is exactly how the std::string interface became bloated.
Consider the bunch of find*/rfind/compare member functions that are
currently present in the std::string interface but could perfectly well be
standalone. Some of them, in fact, already exist as standalone generic
algorithms, which a decent compiler will optimize to the same degree as
the dedicated member functions.
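
As a rough illustration of that point, two of the find members can be written as free functions on top of the generic algorithms. find_char and find_text are made-up names for this sketch (assuming C++17); std::find and std::search are the standard algorithms.

```cpp
#include <algorithm>
#include <cstddef>
#include <string>
#include <string_view>

// Free-function counterpart of std::string::find(char), built on std::find.
// Returns std::string::npos when the character is absent.
inline std::size_t find_char(std::string_view s, char c) {
    auto it = std::find(s.begin(), s.end(), c);
    return it == s.end() ? std::string::npos
                         : static_cast<std::size_t>(it - s.begin());
}

// Counterpart of std::string::find(substring), built on std::search.
inline std::size_t find_text(std::string_view s, std::string_view needle) {
    if (needle.empty()) {
        return 0;  // matches the member function for an empty needle
    }
    auto it = std::search(s.begin(), s.end(), needle.begin(), needle.end());
    return it == s.end() ? std::string::npos
                         : static_cast<std::size_t>(it - s.begin());
}
```

Both return the same positions as the corresponding std::string members for the inputs they cover. Whether a given compiler optimizes them to the same degree is something to measure rather than assume.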

The proposed extension is worse than the mentioned member functions in
that it potentially brings new dependencies to implement formatting.
--
Nicol Bolas
2016-12-29 21:58:59 UTC
Post by Andrey Semashev
On Thursday, December 29, 2016 at 8:10:03 AM UTC-5, Andrey Semashev
std::string is already bloated too much. Adding yet more bloat is hardly
the way to go.
I find this "too bloated" argument to be unconvincing.
Does `std::basic_string` have functions that are, strictly speaking, not
necessary? Yes. But that doesn't mean you shouldn't add functions which
actually /are necessary/ to the interface. The fact that a type already
has unneeded member functions shouldn't stop you from putting needed
ones in there.
I don't think formatting qualifies as the "necessary" or "needed"
functions for std::string.
And I do not think that the basis of the append functionality requires
including a "formatting" system. The basic proposal is fast sequential
string appending. Stuff added on top of that should not interfere with that
basic ideal.
Post by Andrey Semashev
Now, you can argue that it actually /isn't/ necessary, that there are
ways to achieve the same performance of a concatenation function without
making it a member. But "appeal to bloat" isn't a valid argument against
it.
It is, because that is exactly how std::string interface became bloated.
That's not my understanding of the evolution of this type.

As I understand it, what became known as `std::basic_string` was a regular
old concrete string class. It had exactly the kind of interface that string
classes would be expected to have: searching, replacing, etc. That's
typical of string classes, and there's no reason to consider that "bloat".
It *certainly* would not have been considered "bloated" at the time it was
designed.

However, when they moved to stick it into the standard, it was decided to
give it an STL interface in addition to its existing interface.

Post by Andrey Semashev
Consider the bunch of find*/rfind/compare member functions that are
currently present in std::string interface but could perfectly be
standalone. Some of them, in fact, already exist as standalone generic
algorithms, which a decent compiler will optimize to the same degree as
the dedicated member functions.
There is a fundamental difference between the argument you just outlined
and the "X is bloat and bloat is bad" you said before. The difference being
that the above is an actual argument, and the latter is an opinion.

If you want to say that it is not necessary to add appending functionality
to `std::string` itself, then make that argument. But saying that the class
is "bloated" and therefore adding *anything* is wrong is not a functional
argument.

It should also be noted that the primary reason why those member functions
continue to exist in `basic_string` (that is, why the committee did not
*replace* their interface with the STL-based one) is because they work with
integer offsets, rather than iterators/pointers. A *lot* of people do work
with strings based on integer offsets, and they're not going to rewrite all
of their code just because you think that iterators involve less "bloat".
In some cases, they *can't* rewrite their code, since it's in C. So rather
than making a string type that many people would reject, the committee made
sure that the member API would use offsets, while relying on algorithms for
those who want to use iterators.

If you feel that this is "bloat", that's your prerogative. But there is a
genuine reason behind these APIs.
Thiago Macieira
2016-12-29 22:30:03 UTC
Permalink
On Thursday, 29 December 2016, at 20:18:28 BRST, Andrey Semashev wrote:
Post by Andrey Semashev
It is, because that is exactly how std::string interface became bloated.
Consider the bunch of find*/rfind/compare member functions that are
currently present in std::string interface but could perfectly be
standalone. Some of them, in fact, already exist as standalone generic
algorithms, which a decent compiler will optimize to the same degree as
the dedicated member functions.
And yet hardly any compiler will optimise as well as the dedicated copies of
those functions that exist for QString inside QtCore.
--
Thiago Macieira - thiago (AT) macieira.info - thiago (AT) kde.org
Software Architect - Intel Open Source Technology Center
Andrey Semashev
2016-12-30 01:33:15 UTC
Permalink
Post by Thiago Macieira
On Thursday, 29 December 2016, at 20:18:28 BRST, Andrey Semashev wrote:
Post by Andrey Semashev
It is, because that is exactly how std::string interface became bloated.
Consider the bunch of find*/rfind/compare member functions that are
currently present in std::string interface but could perfectly be
standalone. Some of them, in fact, already exist as standalone generic
algorithms, which a decent compiler will optimize to the same degree as
the dedicated member functions.
And yet hardly any compiler will optimise as well as the dedicated copies of
those functions that exist for QString inside QtCore.
I'm not familiar with Qt implementation, but I suspect it doesn't do
anything significantly more optimized than libc string functions. I
know at least gcc is able to convert std::copy/std::fill into
memcpy/memset calls when possible, and I see no reason why it couldn't
convert std::find/std::equal into memmem/memcmp. Does Qt do something
better than that?
Nicol Bolas
2016-12-30 05:23:58 UTC
Permalink
On Thursday, 29 December 2016, at 20:18:28 BRST, Andrey Semashev wrote:
Post by Andrey Semashev
It is, because that is exactly how std::string interface became bloated.
Consider the bunch of find*/rfind/compare member functions that are
currently present in std::string interface but could perfectly be
standalone. Some of them, in fact, already exist as standalone generic
algorithms, which a decent compiler will optimize to the same degree as
the dedicated member functions.
And yet hardly any compiler will optimise as well as the dedicated copies of
those functions that exist for QString inside QtCore.
I'm not familiar with Qt implementation, but I suspect it doesn't do
anything significantly more optimized than libc string functions. I
know at least gcc is able to convert std::copy/std::fill into
memcpy/memset calls when possible, and I see no reason why it couldn't
convert std::find/std::equal into memmem/memcmp. Does Qt do something
better than that?
Here's a better question: so what if it doesn't?

I'm in favor of QOI when it comes to algorithms. But at the end of the day,
it costs me as a user *nothing* to have both `std::find` and
`basic_string::find`. Does it hurt my program in any way that I could have
used `std::find` instead of the member function? No. Does it make my
program in any way confusing? No. Does it make my program run any slower?
No. It doesn't even make my executable bigger, since either way, it'll
compile down to an inlined function.

Then so long as there are genuine benefits to the member function version
(like being able to take integer indices), what's the big deal?
Andrey Semashev
2016-12-30 10:08:11 UTC
Permalink
Post by Andrey Semashev
I'm not familiar with Qt implementation, but I suspect it doesn't do
anything significantly more optimized than libc string functions. I
know at least gcc is able to convert std::copy/std::fill into
memcpy/memset calls when possible, and I see no reason why it couldn't
convert std::find/std::equal into memmem/memcmp. Does Qt do something
better than that?
Here's a better question: so what if it doesn't?
I'm in favor of QOI when it comes to algorithms. But at the end of the
day, it costs me as a user /nothing/ to have both `std::find` and
`basic_string::find`. Does it hurt my program in any way that I could
have used `std::find` instead of the member function? No. Does it make
my program in any way confusing? No. Does it make my program run any
slower? No. It doesn't even make my executable bigger, since either way,
it'll compile down to an inlined function.
Then so long as there are genuine benefits to the member function
version (like being able to take integer indices), what's the big deal?
It affects interface conciseness. As a user you have to learn what those
functions do, why they are there, and when to use them and not the
standalone algorithms. Frankly, I did not find a definitive answer to
these questions myself after years of programming practice.

Index-based interface of these functions is not an advantage that
warrants their existence because you can trivially convert indices into
iterators yourself (and I'm sure these functions do that internally
anyway). Yet that interface requires std::string::npos, a special magic
number, which one has to check for everywhere, including those member
functions. You may argue that the number is named and that it's probably
((size_t)-1) so that you'll never have a string that large, and checking
for it is not expensive, but still it rubs me the wrong way every time I
see it. So what is the reason to use the index-based interface then?
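
A minimal sketch of the index/iterator conversion mentioned above, for illustration only. The helper names `to_iterator` and `to_index` are hypothetical and not part of any library. Note one wrinkle the sketch makes visible: both the index `size()` and `npos` map to `end()`, so the round trip is not perfectly symmetric.

```cpp
#include <algorithm>
#include <cstddef>
#include <string>

// Hypothetical helper: turn a find()-style result (an index, or npos for
// "not found") into an iterator (end() for "not found").
std::string::const_iterator to_iterator(const std::string& s, std::size_t pos)
{
    if (pos == std::string::npos)
        return s.end();
    return s.begin() + static_cast<std::string::difference_type>(pos);
}

// Hypothetical helper: turn an iterator (end() for "not found") back into
// an index (npos for "not found").
std::size_t to_index(const std::string& s, std::string::const_iterator it)
{
    if (it == s.end())
        return std::string::npos;
    return static_cast<std::size_t>(it - s.begin());
}
```

With these, `to_iterator(s, s.find('w'))` and `std::find(s.begin(), s.end(), 'w')` refer to the same position, and the `npos` check becomes an `end()` check.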

There is also a cost if you want to implement a class that mimics
std::string. I've done that a few times, and those member functions do
add complexity to the task.

There is obviously a cost for standard library implementers and standard
writers and committee.
Andrey Semashev
2016-12-30 10:13:47 UTC
Permalink
Post by Andrey Semashev
Post by Andrey Semashev
I'm not familiar with Qt implementation, but I suspect it doesn't do
anything significantly more optimized than libc string functions. I
know at least gcc is able to convert std::copy/std::fill into
memcpy/memset calls when possible, and I see no reason why it couldn't
convert std::find/std::equal into memmem/memcmp. Does Qt do something
better than that?
Here's a better question: so what if it doesn't?
I'm in favor of QOI when it comes to algorithms. But at the end of the
day, it costs me as a user /nothing/ to have both `std::find` and
`basic_string::find`. Does it hurt my program in any way that I could
have used `std::find` instead of the member function? No. Does it make
my program in any way confusing? No. Does it make my program run any
slower? No. It doesn't even make my executable bigger, since either way,
it'll compile down to an inlined function.
Then so long as there are genuine benefits to the member function
version (like being able to take integer indices), what's the big deal?
It affects interface conciseness. As a user you have to learn what those
functions do, why they are there, and when to use them and not the
standalone algorithms. Frankly, I did not find a definitive answer to
these questions myself after years of programming practice.
Index-based interface of these functions is not an advantage that
warrants their existence because you can trivially convert indices into
iterators yourself (and I'm sure these functions do that internally
anyway). Yet that interface requires std::string::npos, a special magic
number, which one has to check for everywhere, including those member
functions. You may argue that the number is named and that it's probably
((size_t)-1) so that you'll never have a string that large, and checking
for it is not expensive, but still it rubs me the wrong way every time I
see it. So what is the reason to use the index-based interface then?
There is also a cost if you want to implement a class that mimics
std::string. I've done that a few times, and those member functions do
add complexity to the task.
I'll add that the standard library itself has now done that once as well,
in std::string_view.
Post by Andrey Semashev
There is obviously a cost for standard library implementers and standard
writers and committee.
Thiago Macieira
2016-12-30 12:16:58 UTC
Permalink
On Friday, 30 December 2016, at 13:08:11 BRST, Andrey Semashev wrote:
Post by Andrey Semashev
It affects interface conciseness. As a user you have to learn what those
functions do, why they are there, and when to use them and not the
standalone algorithms. Frankly, I did not find a definitive answer to
these questions myself after years of programming practice.
A function called "find" somewhere does the same as another function called
"find" elsewhere. Naming matters. So just name like functions likewise, and
unlike functions differently.

This way, the learning carries from one place to the next.

Also, I don't subscribe to the concise-interface paradigm as much as you (and
I guess Bjarne) do. I don't think a date class needs to have a
"find_next_friday" function, but I do think useful functions should be added.
Post by Andrey Semashev
Index-based interface of these functions is not an advantage that
warrants their existence because you can trivially convert indices into
iterators yourself (and I'm sure these functions do that internally
anyway). Yet that interface requires std::string::npos, a special magic
number, which one has to check for everywhere, including those member
functions. You may argue that the number is named and that it's probably
((size_t)-1) so that you'll never have a string that large, and checking
for it is not expensive, but still it rubs me the wrong way every time I
see it. So what is the reason to use the index-based interface then?
Because people are used to it and expect it. There's a self-reinforcing cycle
here: people learnt it, so they use that technique in their code; which makes
new developers learn it too, then use it again and again.

Breaking the cycle is possible, but you're going to cause grief to people who
are used to and expect that technique.
Post by Andrey Semashev
There is also a cost if you want to implement a class that mimics
std::string. I've done that a few times, and those member functions do
add complexity to the task.
True, but that's not an issue we have to concern ourselves with. That happens
very infrequently, it's always done by experts, and if you want to implement a
very core class, you know you have an uphill battle.
Post by Andrey Semashev
There is obviously a cost for standard library implementers and standard
writers and committee.
True, but like the reimplementation, it happens once only, thereafter only
maintenance. But it may save hundreds of hours of work down the line by other
people.
--
Thiago Macieira - thiago (AT) macieira.info - thiago (AT) kde.org
Software Architect - Intel Open Source Technology Center
Nicol Bolas
2016-12-30 18:16:26 UTC
Permalink
Post by Andrey Semashev
On Thursday, December 29, 2016 at 8:33:20 PM UTC-5, Andrey Semashev
I'm not familiar with Qt implementation, but I suspect it doesn't do
anything significantly more optimized than libc string functions. I
know at least gcc is able to convert std::copy/std::fill into
memcpy/memset calls when possible, and I see no reason why it couldn't
convert std::find/std::equal into memmem/memcmp. Does Qt do something
better than that?
Here's a better question: so what if it doesn't?
I'm in favor of QOI when it comes to algorithms. But at the end of the
day, it costs me as a user /nothing/ to have both `std::find` and
`basic_string::find`. Does it hurt my program in any way that I could
have used `std::find` instead of the member function? No. Does it make
my program in any way confusing? No. Does it make my program run any
slower? No. It doesn't even make my executable bigger, since either way,
it'll compile down to an inlined function.
Then so long as there are genuine benefits to the member function
version (like being able to take integer indices), what's the big deal?
It affects interface conciseness. As a user you have to learn what those
functions do, why they are there, and when to use them and not the
standalone algorithms. Frankly, I did not find a definitive answer to
these questions myself after years of programming practice.
Index-based interface of these functions is not an advantage that
warrants their existence because you can trivially convert indices into
iterators yourself (and I'm sure these functions do that internally
anyway). Yet that interface requires std::string::npos, a special magic
number, which one has to check for everywhere, including those member
functions. You may argue that the number is named and that it's probably
((size_t)-1) so that you'll never have a string that large, and checking
for it is not expensive, but still it rubs me the wrong way every time I
see it. So what is the reason to use the index-based interface then?
Interface conciseness is less important than overall readability. Consider
a relatively simple programming task. You're given two strings. You want to
find the last instance of string B within string A, then generate a string
that consists of all characters *before* the instance you found.

This is what the implementation based on generic algorithms and iterators
looks like:

std::string generic(std::string look, std::string pattern)
{
    auto loc = std::search(look.rbegin(), look.rend(),
                           pattern.rbegin(), pattern.rend());

    if(loc != look.rend())
    {
        loc += pattern.size();
        return std::string(look.begin(), loc.base());
    }
    else
        return std::string{};
}

Understanding this code requires being well versed in how reverse iterators
work. Two particularly non-obvious things I had to do were: 1) reversing
the *pattern* as well as the string being searched and 2) offsetting `loc`
by the pattern's size, since that's not the actual location.
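
To make the second point concrete, here is a small sketch (an illustration, not code from the thread) that isolates the offset arithmetic. The function name `last_match_index` is hypothetical. It relies on the standard relationship that a reverse iterator refers to the element just before the one its `base()` points at.

```cpp
#include <algorithm>
#include <cstddef>
#include <string>

// Hypothetical helper: index of the last occurrence of `pattern` in `look`,
// or std::string::npos, using only reverse iterators and std::search.
std::size_t last_match_index(const std::string& look, const std::string& pattern)
{
    auto rloc = std::search(look.rbegin(), look.rend(),
                            pattern.rbegin(), pattern.rend());
    if (rloc == look.rend())
        return std::string::npos;

    // rloc refers to the LAST character of the match (the first one seen when
    // walking backwards), so rloc.base() points one past the end of the match.
    // Advancing rloc by pattern.size() moves base() to the start of the match.
    rloc += static_cast<std::string::difference_type>(pattern.size());
    return static_cast<std::size_t>(rloc.base() - look.begin());
}
```

For a non-empty pattern this should agree with `look.rfind(pattern)`; the empty-pattern and empty-string corner cases are deliberately not covered by this sketch.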

This is what the index-based member function version looks like:

std::string member(std::string look, std::string pattern)
{
    auto loc = look.rfind(pattern);

    if(loc != std::string::npos)
    {
        return look.substr(0, loc);
    }

    return std::string{};
}

That's much shorter and more easily understood. There's no need to reverse
the pattern's range, nor the mysterious offset. The only thing that might
be at all confusing is the test against `npos`. But that's ultimately no
different from testing against `look.rend()`.

As for compiler optimizations, here are the compiled results for GCC 7,
under -O3 <https://godbolt.org/g/xDFhgs>. It seems to me that `member` is
much shorter in assembly than `generic`, requiring a lot fewer jumps and
the like. So your "the compiler can sort it out" argument seems to not be
true in this case.

Post by Andrey Semashev
There is also a cost if you want to implement a class that mimics
std::string. I've done that a few times, and those member functions do
add complexity to the task.
There is obviously a cost for standard library implementers and standard
writers and committee.
So you're saying that we should avoid good APIs because they're hard to get
through committee? That sounds like a problem with the committee process,
not with the API.
Greg Marr
2016-12-31 03:18:40 UTC
Permalink
Post by Nicol Bolas
Interface conciseness is less important than overall readability. Consider
a relatively simple programming task. You're given two strings. You want to
find the last instance of string B within string A, then generate a string
that consists of all characters *before* the instance you found.
This is what the implementation based on generic algorithms and iterators
std::string generic(std::string look, std::string pattern)
{
    auto loc = std::search(look.rbegin(), look.rend(),
                           pattern.rbegin(), pattern.rend());
    if(loc != look.rend())
    {
        loc += pattern.size();
        return std::string(look.begin(), loc.base());
    }
    else
        return std::string{};
}
Understanding this code requires being well versed in how reverse
iterators work. Two particularly non-obvious things I had to do were: 1)
reversing the *pattern* as well as the string being searched and 2)
offsetting `loc` by the pattern's size, since that's not the actual
location.
std::string member(std::string look, std::string pattern)
{
    auto loc = look.rfind(pattern);
    if(loc != std::string::npos)
    {
        return look.substr(0, loc);
    }
    return std::string{};
}
That's much shorter and more easily understood. There's no need to reverse
the pattern's range, nor the mysterious offset. The only thing that might
be at all confusing is the test against `npos`. But that's ultimately no
different from testing against `look.rend()`.
Seems to me that's only because you chose a sub-optimal algorithm. You
want the one that matches string::rfind, but you used the one that matches
string::find instead, so you had to add extra code to make it work right.
Try this:

std::string generic(std::string look, std::string pattern)
{
    auto loc = std::find_end(look.begin(), look.end(),
                             pattern.begin(), pattern.end());

    if(loc != look.end())
    {
        return std::string(look.begin(), loc);
    }

    return std::string{};
}

or with Ranges:

std::string generic(std::string look, std::string pattern)
{
    auto loc = std::find_end(look, pattern);

    if(loc != look.end())
    {
        return std::string(look.begin(), loc);
    }

    return std::string{};
}

The generated assembly here looks to be smaller than search, but not as
small as rfind.
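
One corner case is worth checking before treating the two versions as interchangeable. `std::find_end` is specified to return `last` when the needle is empty, whereas `rfind` with an empty pattern returns `size()`, so the two functions disagree for an empty pattern. The sketch below restates both versions under the hypothetical names `prefix_find_end` and `prefix_rfind` (taking `const std::string&` rather than by value) so the difference can be observed directly.

```cpp
#include <algorithm>
#include <string>

// Iterator-based version, built on std::find_end.
std::string prefix_find_end(const std::string& look, const std::string& pattern)
{
    auto loc = std::find_end(look.begin(), look.end(),
                             pattern.begin(), pattern.end());
    if (loc != look.end())
        return std::string(look.begin(), loc);
    return std::string{};   // not found, or pattern was empty
}

// Index-based version, built on std::string::rfind.
std::string prefix_rfind(const std::string& look, const std::string& pattern)
{
    auto loc = look.rfind(pattern);
    if (loc != std::string::npos)
        return look.substr(0, loc);
    return std::string{};   // not found
}
```

For `("a/b/c", "/")` both return `"a/b"`. For `("abc", "")` the `find_end` version returns an empty string while the `rfind` version returns `"abc"`.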
Nicol Bolas
2016-12-31 16:15:05 UTC
Permalink
Post by Greg Marr
Post by Nicol Bolas
Interface conciseness is less important than overall readability.
Consider a relatively simple programming task. You're given two strings.
You want to find the last instance of string B within string A, then
generate a string that consists of all characters *before* the instance
you found.
This is what the implementation based on generic algorithms and iterators
std::string generic(std::string look, std::string pattern)
{
    auto loc = std::search(look.rbegin(), look.rend(),
                           pattern.rbegin(), pattern.rend());
    if(loc != look.rend())
    {
        loc += pattern.size();
        return std::string(look.begin(), loc.base());
    }
    else
        return std::string{};
}
Understanding this code requires being well versed in how reverse
iterators work. Two particularly non-obvious things I had to do were: 1)
reversing the *pattern* as well as the string being searched and 2)
offsetting `loc` by the pattern's size, since that's not the actual
location.
std::string member(std::string look, std::string pattern)
{
    auto loc = look.rfind(pattern);
    if(loc != std::string::npos)
    {
        return look.substr(0, loc);
    }
    return std::string{};
}
That's much shorter and more easily understood. There's no need to
reverse the pattern's range, nor the mysterious offset. The only thing that
might be at all confusing is the test against `npos`. But that's ultimately
no different from testing against `look.rend()`.
Seems to me that's only because you chose a sub-optimal algorithm. You
want the one that matches string::rfind, but you used the one that matches
string::find instead, so you had to add extra code to make it work right.
This actually emphasizes one of the problems with having a gigantic
algorithms library: the difficulty of finding appropriate operations.

I have been using C++ for decades now. And yet until you posted that, I had
never even *heard of* `std::find_end`. It's been there for nearly two
decades, and yet, I never knew it existed.

If people don't know an algorithm exists, it can't be used.
Ville Voutilainen
2016-12-31 16:48:24 UTC
Permalink
Post by Nicol Bolas
Post by Greg Marr
Seems to me that's only because you chose a sub-optimal algorithm. You
want the one that matches string::rfind, but you used the one that matches
string::find instead, so you had to add extra code to make it work right.
This actually emphasizes one of the problems with having a gigantic
algorithms library: the difficulty of finding appropriate operations.
I have been using C++ for decades now. And yet until you posted that, I had
never even heard of `std::find_end`. It's been there for nearly two decades,
and yet, I never knew it existed.
If people don't know an algorithm exists, it can't be used.
Right, but I'm not convinced people are any wiser when they don't know
a member function exists.
In a completely anecdotal fashion, I tend to separate "what tools do I
have to solve this problem?" from "let's bang on the
keyboard and see what we'll find". Sometimes it's said that member
functions are easier to find
for IDEs and tools like intellisense, but when I have done the "what
tools?" phase, I tend to
turn the completion tools off if they give me a single false positive,
because they end up slowing
me down when I know what I intend to write. To each their own, I guess.

But hey, what proposal are we discussing, again? :) After all this
rather philosophical discussion,
I'm not quite sure what we're looking at. ;)
o***@join.cc
2017-01-13 16:10:33 UTC
Permalink
Post by Ville Voutilainen
But hey, what proposal are we discussing, again? :) After all this
rather philosophical discussion,
I'm not quite sure what we're looking at. ;)
I'm glad you asked. :D

append(s, "A", "B", 42); with at least support for string but perhaps more
general support.

It could call append(s, v) for each argument, providing a simple
customization support.
I'm not sure how to combine customization and optimal performance though.
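
A rough sketch of what that could look like, written as plain C++11 so it does not depend on newer language features. Everything here is an assumption for illustration: the set of built-in overloads, the free-function spelling `append(s, v)`, and the `demo::Version` type are all hypothetical. The sketch only shows the customization half; it does not precompute the total length or call `reserve`, so the performance question raised above is left open.

```cpp
#include <string>

// Assumed per-type customization points: one overload per appendable type.
inline void append(std::string& s, const std::string& v) { s += v; }
inline void append(std::string& s, const char* v)        { s += v; }
inline void append(std::string& s, char v)               { s += v; }
inline void append(std::string& s, int v)                { s += std::to_string(v); }

namespace demo {
// Hypothetical user type that opts in by providing its own overload,
// which the variadic front end finds through argument-dependent lookup.
struct Version { int major_part; int minor_part; };

inline void append(std::string& s, const Version& v)
{
    s += std::to_string(v.major_part);
    s += '.';
    s += std::to_string(v.minor_part);
}
} // namespace demo

// Variadic front end: forwards each argument, left to right, to append(s, v).
template <typename T, typename U, typename... Rest>
void append(std::string& s, const T& first, const U& second, const Rest&... rest)
{
    append(s, first);             // single-argument overload
    append(s, second, rest...);   // recurse on the remaining arguments
}
```

With this in place, `append(s, "A", "B", 42)` appends `AB42` to `s`, and `append(s, "v", demo::Version{1, 2}, '!')` appends `v1.2!`.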
Andrey Semashev
2017-01-01 00:08:20 UTC
Permalink
Post by Nicol Bolas
Interface conciseness is less important than overall readability.
Consider a relatively simple programming task. You're given two strings.
You want to find the last instance of string B within string A, then
generate a string that consists of all characters /before/ the instance
you found.
This is what the implementation based on generic algorithms and iterators
looks like:

std::string generic(std::string look, std::string pattern)
{
    auto loc = std::search(look.rbegin(), look.rend(),
                           pattern.rbegin(), pattern.rend());
    if(loc != look.rend())
    {
        loc += pattern.size();
        return std::string(look.begin(), loc.base());
    }
    else
        return std::string{};
}

std::string member(std::string look, std::string pattern)
{
    auto loc = look.rfind(pattern);
    if(loc != std::string::npos)
    {
        return look.substr(0, loc);
    }
    return std::string{};
}
Well, if the `std::string::rfind` algorithm was standalone, I believe
the code would've been just as readable:

template< typename Iterator1, typename Iterator2 >
Iterator1 search_last(Iterator1 begin, Iterator1 end,
                      Iterator2 needle_begin, Iterator2 needle_end);

std::string generic2(std::string look, std::string pattern)
{
    auto it = search_last(look.begin(), look.end(),
                          pattern.begin(), pattern.end());

    if (it != look.end())
        return std::string(look.begin(), it);

    return std::string{};
}

That `search_last` algorithm could've been a generic algorithm with
specialized versions for optimal performance (if the compiler doesn't
already do a good job). So once again, index-based interface (I mean,
the choice of indices over iterators by itself), or algorithms being
`std::string` members doesn't make the code any clearer.

I can understand someone might be used to indices as a concept, but
really, most of the standard library is built around iterators; you'd
expect anyone more or less familiar with C++ should be used to iterators
in no less degree.
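
For what it's worth, the standard library already has a generic algorithm with this behaviour: std::find_end returns the start of the last occurrence of one sequence inside another. A minimal sketch, assuming `search_last` (the name from the declaration above, not a standard function) simply forwards to it:

```cpp
#include <algorithm>
#include <string>

// Hypothetical helper named after the declaration in the message above.
// std::find_end returns an iterator to the start of the LAST occurrence of
// [needle_begin, needle_end) in [begin, end), or `end` if there is none.
template <typename Iterator1, typename Iterator2>
Iterator1 search_last(Iterator1 begin, Iterator1 end,
                      Iterator2 needle_begin, Iterator2 needle_end)
{
    return std::find_end(begin, end, needle_begin, needle_end);
}

// Everything in `look` that precedes the last occurrence of `pattern`.
std::string generic2(const std::string& look, const std::string& pattern)
{
    auto it = search_last(look.begin(), look.end(),
                          pattern.begin(), pattern.end());

    if (it != look.end())
        return std::string(look.begin(), it);

    return std::string{};
}
```

One difference from the `rfind` version: std::find_end returns the end iterator for an empty pattern, so this sketch yields an empty string in that case.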
Post by Nicol Bolas
As for compiler optimizations, here are the compiled results for GCC 7,
under -O3 <https://godbolt.org/g/xDFhgs>. It seems to me that `member`
is much shorter in assembly than `generic`, requiring a lot fewer jumps
and the like. So your "the compiler can sort it out" argument seems to
not be true in this case.
Ok, fair enough, the compiler is not good at optimizing reverse
iteration. Let the library provide an optimized implementation of that
algorithm then. Let it be reusable in classes other than `std::string`.
In that case there is no need to pile it into the `std::string` interface.
A more concise `std::string` interface, more reusable algorithms, everyone
wins.

I know it's too late for the existing algorithms in `std::string`. I
just don't want it to get worse.
Post by Nicol Bolas
There is also a cost if you want to implement a class that mimics
std::string. I've done that a few times, and those member functions do
add complexity to the task.
There is obviously a cost for standard library implementers and standard
writers and committee.
So you're saying that we should avoid good APIs because they're hard to
get through committee? That sounds like a problem with the committee
process, not with the API.
I'm saying a poorly designed API that imposes coupling and additional
dependencies adds a cost both for users who want to use or implement that
API and for the committee, which has to maintain the evolution of the library.

I'm not saying no new APIs should be added to the standard library.
Thiago Macieira
2016-12-30 12:09:55 UTC
Permalink
On Thursday, 29 December 2016, at 21:23:58 BRST, Nicol Bolas
Post by Nicol Bolas
I'm in favor of QOI when it comes to algorithms. But at the end of the day,
it costs me as a user *nothing* to have both `std::find` and
`basic_string::find`.
I agree with that, but not with your conclusions.
Post by Nicol Bolas
Does it hurt my program in any way that I could have
used `std::find` instead of the member function? No. Does it make my
program in any way confusing? No.
It could hurt, but it's probably not confusing.
Post by Nicol Bolas
Does it make my program run any slower? No.
It could, if they are optimised differently. If you use the generic function
that is optimised for amortising its setup cost over 1 kB of data on a
64-byte data block instead of the function optimised for less than 128 bytes,
then your code will run slower.
Post by Nicol Bolas
It doesn't even make my executable bigger, since either way, it'll
compile down to an inlined function.
Uh... that makes it bigger, not smaller. Compiling to more inlined code always
makes it bigger. If you want to make it smaller, you have to call the same
out-of-line function.
--
Thiago Macieira - thiago (AT) macieira.info - thiago (AT) kde.org
Software Architect - Intel Open Source Technology Center
Thiago Macieira
2016-12-30 13:58:31 UTC
Permalink
Post by Thiago Macieira
On Thursday, 29 December 2016, at 21:23:58 BRST, Nicol Bolas
Post by Nicol Bolas
It doesn't even make my executable bigger, since either way, it'll
compile down to an inlined function.
Uh... that makes it bigger, not smaller. Compiling to more inlined code
always makes it bigger. If you want to make it smaller, you have to call
the same out-of-line function.
Not *always*. Inlined code can open up new opportunities for the
optimizer, and save lots of code compared to passing parameters and
calling out-of-line functions.
Right, I apologise for the generalisation.

But in this case it holds true: those functions aren't trivial, so inlining
them at every call site increases the code size rather than decreasing it.
--
Thiago Macieira - thiago (AT) macieira.info - thiago (AT) kde.org
Software Architect - Intel Open Source Technology Center
Nicol Bolas
2016-12-30 17:10:22 UTC
Permalink
On Friday, 30 December 2016, at 14:52:21 BRST, Bo Persson
On Thursday, 29 December 2016, at 21:23:58 BRST, Nicol Bolas
Post by Nicol Bolas
It doesn't even make my executable bigger, since either way, it'll
compile down to an inlined function.
Uh... that makes it bigger, not smaller. Compiling to more inlined code
always makes it bigger. If you want to make it smaller, you have to call
the same out-of-line function.
Not *always*. Inlined code can open up new opportunities for the
optimizer, and save lots of code compared to passing parameters and
calling out-of-line functions.
Right, I apologise for the generalisation.
But in this case it holds true: those functions aren't trivial, so inlining
them at every call site increases the code size rather than decreasing it.
My point was that both implementations, the member version and free
function, would be inlined to the same code. So neither will be bigger
relative to the other.
Thiago Macieira
2016-12-30 12:06:23 UTC
Permalink
On Friday, 30 December 2016, at 04:33:15 BRST, Andrey Semashev
Post by Andrey Semashev
Post by Thiago Macieira
And yet hardly any compiler will optimise as well as the dedicated copies
of those functions that exist for QString inside QtCore.
I'm not familiar with Qt implementation, but I suspect it doesn't do
anything significantly more optimized than libc string functions. I
know at least gcc is able to convert std::copy/std::fill into
memcpy/memset calls when possible, and I see no reason why it couldn't
convert std::find/std::equal into memmem/memcmp. Does Qt do something
better than that?
Now try that for std::u16string and std::wstring.

And yes, it does something better than that, because the libc functions are
optimised differently. memcmp is optimised for large data blocks, but most
strings are actually quite short, to the point that the runtime detection
needed to pick the best strategy for long versus short strings is itself
significant overhead.

By having a dedicated function somewhere, an implementation can provide an
out-of-line copy, hand-rolled for the use-cases.
--
Thiago Macieira - thiago (AT) macieira.info - thiago (AT) kde.org
Software Architect - Intel Open Source Technology Center
Andrey Semashev
2016-12-30 13:07:45 UTC
Permalink
Post by Thiago Macieira
On Friday, 30 December 2016, at 04:33:15 BRST, Andrey Semashev
Post by Andrey Semashev
Post by Thiago Macieira
And yet hardly any compiler will optimise as well as the dedicated copies
of those functions that exist for QString inside QtCore.
I'm not familiar with Qt implementation, but I suspect it doesn't do
anything significantly more optimized than libc string functions. I
know at least gcc is able to convert std::copy/std::fill into
memcpy/memset calls when possible, and I see no reason why it couldn't
convert std::find/std::equal into memmem/memcmp. Does Qt do something
better than that?
Now try that for std::u16string and std::wstring.
The only problem is with memset, and it's partly mitigated by wmemset.
The other functions don't depend on character sizes.
Post by Thiago Macieira
And yes, it does something better than that because the libc functions are
optimised differently. memcmp is optimised for large data blocks, but most
strings are actually quite short, to the point that the necessary detection at
runtime to figure out the best strategy for long and short strings is enough
overhead.
By having a dedicated function somewhere, an implementation can provide an
out-of-line copy, hand-rolled for the use-cases.
Ok, I see. But is there a reason to have these optimized algorithms as
class members as opposed to free functions? Why limit their use to a
particular class? I mean, std::string_view and other user-defined
analogues of std::string (maybe even Qt included) could just make use of
the optimized string algorithms in the standard library, if they were
standalone and generic enough.
Thiago Macieira
2016-12-30 13:21:08 UTC
Permalink
On Friday, 30 December 2016, at 16:07:45 BRST, Andrey Semashev
Post by Andrey Semashev
Post by Thiago Macieira
Now try that for std::u16string and std::wstring.
The only problem is with memset, and it's partly mitigated by wmemset.
The other functions don't depend on character sizes.
Sure they do. memchr finds a single byte, not a word of 2 or 4 bytes. memcmp is
the same: it does a byte-by-byte comparison and returns a difference of the
first byte that compared differently. Except that the difference is very likely
incorrect for a 2-byte word on little-endian machines.

You can implement 2- and 4-byte string operations on top of the 1-byte libc
functions, but they won't be as efficient as the implementations doing direct 2-
and 4-byte ops.
Post by Andrey Semashev
Post by Thiago Macieira
By having a dedicated function somewhere, an implementation can provide an
out-of-line copy, hand-rolled for the use-cases.
Ok, I see. But is there a reason to have these optimized algorithms as
class members as opposed to free functions? Why limit their use to a
particular class? I mean, std::string_view and other user-defined
analogues of std::string (maybe even Qt included) could just make use of
the optimized string algorithms in the standard library, if they were
standalone and generic enough.
The question here is different. I'd like to have those methods accessible to
me without using std::string (though std::u16string_view would do nicely).

But I think they should be available as members for other reasons besides
efficiency.
--
Thiago Macieira - thiago (AT) macieira.info - thiago (AT) kde.org
Software Architect - Intel Open Source Technology Center
Andrey Semashev
2016-12-30 13:58:18 UTC
Permalink
Post by Thiago Macieira
On Friday, 30 December 2016, at 16:07:45 BRST, Andrey Semashev
Post by Andrey Semashev
Post by Thiago Macieira
Now try that for std::u16string and std::wstring.
The only problem is with memset, and it's partly mitigated by wmemset.
The other functions don't depend on character sizes.
Sure they do. memchr finds a single byte, not a word of 2 or 4 bytes.
I listed memmem, which searches an arbitrarily sized needle. On systems
where it is absent, it is very easy to emulate through memchr.
Post by Thiago Macieira
memcmp is
the same: it does a byte-by-byte comparison and returns a difference of the
first byte that compared differently.
No, it doesn't. Its result is <0, 0 or >0, and not necessarily the
difference.
Post by Thiago Macieira
Except that the difference is very likely
incorrect for a 2-byte word on little-endian machines.
Ah, right. We have wmemcmp then.
Post by Thiago Macieira
Post by Andrey Semashev
Ok, I see. But is there a reason to have these optimized algorithms as
class members as opposed to free functions? Why limit their use to a
particular class? I mean, std::string_view and other user-defined
analogues of std::string (maybe even Qt included) could just make use of
the optimized string algorithms in the standard library, if they were
standalone and generic enough.
The question here is different. I'd like to have those methods accessible to
me without using std::string (though std::u16string_view would do nicely).
But I think they should be available as members for other reasons besides
efficiency.
What are those reasons?
Thiago Macieira
2016-12-30 14:46:04 UTC
Permalink
On Friday, 30 December 2016, at 16:58:18 BRST, Andrey Semashev
Post by Andrey Semashev
Post by Thiago Macieira
On Friday, 30 December 2016, at 16:07:45 BRST, Andrey Semashev
Post by Andrey Semashev
Post by Thiago Macieira
Now try that for std::u16string and std::wstring.
The only problem is with memset, and it's partly mitigated by wmemset.
The other functions don't depend on character sizes.
Sure they do. memchr finds a single byte, not a word of 2 or 4 bytes.
I listed memmem, which searches an arbitrarily sized needle. On systems
where it is absent, it is very easy to emulate through memchr.
But not as efficient if you're looking at words, not just needles.

Take the following array of 16-bit elements:
{ 0x0102, 0x0304, 0x0506, 0 }

On little-endian machines, that's the byte sequence

02 01 04 03 06 05 00 00

If you search for 0x401 with memmem, you're going to find it at byte index 1,
but that's not a valid match because it straddles the boundary of two 16-bit
elements.

If you searched for 0x404 with memchr, you'd search first for 0x04, which you'd
find at byte index 2, so it's valid, only to conclude that the next byte isn't
a match.
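
The straddling match described above can be reproduced with a plain byte-wise search. This is a minimal sketch, not anyone's actual implementation: std::search over bytes stands in for memmem (which is not part of ISO C or C++), the little-endian byte image is written out by hand so the result does not depend on the host, and `byte_search`, `aligned_search` and `not_found` are names made up for illustration.

```cpp
#include <algorithm>
#include <cstddef>
#include <vector>

using Bytes = std::vector<unsigned char>;
constexpr std::size_t not_found = static_cast<std::size_t>(-1);

// Byte-wise search starting at byte offset `from`; returns the byte offset
// of the first match or not_found. Assumes a non-empty needle.
std::size_t byte_search(const Bytes& hay, const Bytes& needle,
                        std::size_t from = 0)
{
    if (from >= hay.size())
        return not_found;
    auto it = std::search(hay.begin() + static_cast<std::ptrdiff_t>(from),
                          hay.end(), needle.begin(), needle.end());
    return it == hay.end() ? not_found
                           : static_cast<std::size_t>(it - hay.begin());
}

// Same search, but a hit only counts if it starts on an element boundary
// (a multiple of `width` bytes). Misaligned hits are skipped and the search
// resumes one byte later, which is the extra work the message refers to.
std::size_t aligned_search(const Bytes& hay, const Bytes& needle,
                           std::size_t width)
{
    std::size_t pos = byte_search(hay, needle);
    while (pos != not_found && pos % width != 0)
        pos = byte_search(hay, needle, pos + 1);
    return pos;
}
```

With the byte sequence 02 01 04 03 06 05 00 00 and the needle 01 04 (0x0401 in little-endian order), `byte_search` reports offset 1, while `aligned_search` with a width of 2 reports no match.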
Post by Andrey Semashev
Post by Thiago Macieira
memcmp is
the same: it does a byte-by-byte comparison and returns a difference of the
first byte that compared differently.
No, it doesn't. Its result is <0, 0 or >0, and not necessarily the
difference.
Either way, it can be wrong. Take these two strings:

u"ABC\u0180" => 41 00 42 00 43 00 80 01
u"ABC\u027F" => 41 00 42 00 43 00 7F 02

A proper char16_t comparison should find that the first string is less than the
second. But a pure memcmp will find that byte index 6 differs and that 0x7F is
less than 0x80, so the second string is less than the first.

So memcmp won't cut it. You need a function that returns the pointer to or
index of the first byte that compared unequally, so that you can inspect the
word that contains that differing byte. There's no such function in libc.
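
The ordering problem can be checked without relying on the host byte order by writing the little-endian byte images out by hand. A minimal sketch; the array and function names are made up for illustration, and the element-wise comparison uses std::u16string::compare, which compares char16_t code units:

```cpp
#include <cstring>
#include <string>

// Little-endian byte images of u"ABC\u0180" and u"ABC\u027F".
const unsigned char first_le[]  = {0x41, 0x00, 0x42, 0x00,
                                   0x43, 0x00, 0x80, 0x01};
const unsigned char second_le[] = {0x41, 0x00, 0x42, 0x00,
                                   0x43, 0x00, 0x7F, 0x02};

// Byte-wise ordering: the first difference is at byte index 6 (0x80 vs 0x7F),
// so memcmp ranks the first string AFTER the second.
int bytewise_order()
{
    return std::memcmp(first_le, second_le, sizeof first_le);
}

// Element-wise ordering: 0x0180 < 0x027F, so the first string ranks BEFORE
// the second.
int element_order()
{
    return std::u16string(u"ABC\u0180").compare(u"ABC\u027F");
}
```

The two functions return results of opposite sign, which is the disagreement described above.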
Post by Andrey Semashev
Post by Thiago Macieira
Except that the difference is very likely
incorrect for a 2-byte word on little-endian machines.
Ah, right. We have wmemcmp then.
wchar_t is 4 bytes on Unix systems, so it won't help for char16_t. Where it is
2 bytes, it won't help for char32_t.
Post by Andrey Semashev
Post by Thiago Macieira
But I think they should be available as members for other reasons besides
efficiency.
What are those reasons?
I explained in another email: convenience and difference in philosophy. I do
believe a class should provide as members most of the common operations to be
done to its data. That's why QString has startsWith() and endsWith(), which
are trivially easy to implement.
--
Thiago Macieira - thiago (AT) macieira.info - thiago (AT) kde.org
Software Architect - Intel Open Source Technology Center
Olaf van der Spek
2016-12-30 16:56:00 UTC
Permalink
Post by Thiago Macieira
On Friday, 30 December 2016, at 16:58:18 BRST, Andrey Semashev
I explained in another email: convenience and difference in philosophy. I do
believe a class should provide as members most of the common operations to be
done to its data. That's why QString has startsWith() and endsWith(), which
are trivially easy to implement.
However, if those are implemented as free functions taking something
like string_view, other string types would be able to take advantage
of them too...
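
A minimal sketch of that idea, assuming C++17's std::string_view. The namespace `demo` and both function names are illustrative only; anything convertible to std::string_view (std::string, string literals, or a user-defined string type with a conversion) can call them.

```cpp
#include <string>
#include <string_view>

namespace demo {

// True if `text` begins with `prefix`. An empty prefix always matches.
constexpr bool starts_with(std::string_view text,
                           std::string_view prefix) noexcept
{
    return text.size() >= prefix.size() &&
           text.compare(0, prefix.size(), prefix) == 0;
}

// True if `text` ends with `suffix`. An empty suffix always matches.
constexpr bool ends_with(std::string_view text,
                         std::string_view suffix) noexcept
{
    return text.size() >= suffix.size() &&
           text.compare(text.size() - suffix.size(), suffix.size(),
                        suffix) == 0;
}

} // namespace demo
```

For example, `demo::starts_with(std::string("hello"), "he")` and `demo::ends_with("hello", "llo")` both hold, with no copy of the string being made.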
--
Olaf
Nicol Bolas
2016-12-30 17:21:52 UTC
Permalink
On Friday, 30 December 2016, at 16:58:18 BRST, Andrey Semashev
I explained in another email: convenience and difference in philosophy. I do
believe a class should provide as members most of the common operations to be
done to its data. That's why QString has startsWith() and endsWith(), which
are trivially easy to implement.
However, if those are implemented as free functions taking something
like string_view, other string types would be able to take advantage
of them too...
Sure. But the problem with over-genericization is that, anyone who is not
steeped in the lore of all your generic algorithms can't figure out how to
do something simple.

`startsWith` is a perfectly fine and descriptive name, for a string member
function. The context and its parameters explain what it's doing. But the
non-member generic version couldn't be simply named `starts_with`. It would
have to be something like `matches_initial_sequence`. Far less descriptive.
Plus, there's the fact that it can conceptually work with any forward
range(s), which makes it less likely that someone looking for how to do
this test on strings will find it.

What's worse is that people could easily argue that we don't need
`matches_initial_sequence` at all. Once we have ranges, people can argue
that `matches_initial_sequence` is equivalent to `std::equal(some_str |
initial_sequence(size(other_str)), other_str);` So do we really need such a
function? I could see people arguing that it's just added "bloat".
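
To make the generic version concrete, here is a minimal sketch that works on any pair of forward ranges. `matches_initial_sequence` is the hypothetical name used above, and the implementation via the four-iterator std::mismatch (C++14) is just one possible choice. With C++20 ranges, the `initial_sequence(n)` adaptor in the expression above would roughly correspond to `std::views::take(n)`.

```cpp
#include <algorithm>
#include <iterator>

// Hypothetical generic helper: true if `seq` begins with the elements of
// `prefix`. std::mismatch walks both ranges in lockstep and stops at the
// first difference or at the end of the shorter range; the prefix matched
// completely exactly when its end was reached.
template <typename Range1, typename Range2>
bool matches_initial_sequence(const Range1& seq, const Range2& prefix)
{
    auto result = std::mismatch(std::begin(seq), std::end(seq),
                                std::begin(prefix), std::end(prefix));
    return result.second == std::end(prefix);
}
```

Note that passing string literals directly would compare the terminating null as well, because they are arrays; wrapping them in std::string avoids that.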
Olaf van der Spek
2016-12-30 17:24:00 UTC
Permalink
Post by Nicol Bolas
Post by Olaf van der Spek
Post by Thiago Macieira
On Friday, 30 December 2016, at 16:58:18 BRST, Andrey Semashev
I explained in another email: convenience and difference in philosophy. I do
believe a class should provide as members most of the common operations to be
done to its data. That's why QString has startsWith() and endsWith(), which
are trivially easy to implement.
However, if those are implemented as free functions taking something
like string_view, other string types would be able to take advantage
of them too...
Sure. But the problem with over-genericization is that, anyone who is not
steeped in the lore of all your generic algorithms can't figure out how to
do something simple.
`startsWith` is a perfectly fine and descriptive name, for a string member
function. The context and its parameters explain what it's doing. But the
non-member generic version couldn't be simply named `starts_with`. It would
Why not? Works fine for Boost:
http://www.boost.org/doc/libs/1_63_0/doc/html/boost/algorithm/starts_with.html
Post by Nicol Bolas
have to be something like `matches_initial_sequence`. Far less descriptive.
Plus, there's the fact that it can conceptually work with any forward
range(s), which makes it less likely that someone looking for how to do this
test on strings will find it.
What's worse is that people could easily argue that we don't need
`matches_initial_sequence` at all. Once we have ranges, people can argue
that `matches_initial_sequence` is equivalent to `std::equal(some_str |
initial_sequence(size(other_str)), other_str);` So do we really need such a
function? I could see people arguing that it's just added "bloat".
Hehe
--
Olaf
Thiago Macieira
2016-12-30 18:49:58 UTC
Permalink
On Friday, 30 December 2016, at 17:56:00 BRST, Olaf van der Spek
Post by Olaf van der Spek
Post by Thiago Macieira
On Friday, 30 December 2016, at 16:58:18 BRST, Andrey Semashev
I explained in another email: convenience and difference in philosophy. I
do believe a class should provide as members most of the common
operations to be done to its data. That's why QString has startsWith()
and endsWith(), which are trivially easy to implement.
However, if those are implemented as free functions taking something
like string_view, other string types would be able to take advantage
of them too...
I don't see it that way. Like I said, they're trivially easy to implement, so
they're trivially easy to reimplement. They can be implemented multiple times,
one for each string-like class.

Or std::string could delegate everything to std::string_view on itself. This
achieves high code reuse.
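
A minimal sketch of that delegation, assuming C++17. `Name` is a made-up string-like class and `starts_with_impl` a made-up shared helper; the point is only that the member is a one-line forwarder to code written once against std::string_view.

```cpp
#include <string>
#include <string_view>
#include <utility>

namespace sketch {

// Shared implementation, written (and tested) once.
// substr clamps the count, so a prefix longer than `text` never matches.
inline bool starts_with_impl(std::string_view text, std::string_view prefix)
{
    return text.substr(0, prefix.size()) == prefix;
}

// Hypothetical string-like class: the convenience member only forwards.
class Name {
public:
    explicit Name(std::string value) : value_(std::move(value)) {}

    std::string_view view() const { return value_; }

    bool startsWith(std::string_view prefix) const
    {
        return starts_with_impl(view(), prefix);
    }

private:
    std::string value_;
};

} // namespace sketch
```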
--
Thiago Macieira - thiago (AT) macieira.info - thiago (AT) kde.org
Software Architect - Intel Open Source Technology Center
Nicol Bolas
2016-12-31 01:48:35 UTC
Permalink
On Friday, 30 December 2016, at 17:56:00 BRST, Olaf van der Spek
Post by Olaf van der Spek
On Friday, 30 December 2016, at 16:58:18 BRST, Andrey Semashev
I explained in another email: convenience and difference in philosophy. I
do believe a class should provide as members most of the common
operations to be done to its data. That's why QString has startsWith()
and endsWith(), which are trivially easy to implement.
However, if those are implemented as free functions taking something
like string_view, other string types would be able to take advantage
of them too...
I don't see it that way. Like I said, they're trivially easy to implement, so
they're trivially easy to reimplement. They can be implemented multiple times,
one for each string-like class.
Any one such function is relatively easy to implement. But when you have
*twenty* such functions, it starts being a significant task. *Especially*
once you want to start testing them all comprehensively.

This is precisely why even simple algorithms like `search` and `accumulate`
and so forth are not bound to a specific container. Yes, we could always
rewrite them. But what's the point of that, when we don't *have to*?
Thiago Macieira
2016-12-31 02:05:37 UTC
Permalink
On Friday, 30 December 2016, at 17:48:35 BRST, Nicol Bolas
Post by Nicol Bolas
Post by Thiago Macieira
I don't see it that way. Like I said, they're trivially easy to implement, so
they're trivially easy to reimplement. They can be implemented multiple times,
one for each string-like class.
Any one such function is relatively easy to implement. But when you have
*twenty* such functions, it starts being a significant task. *Especially*
once you want to start testing them all comprehensively.
If they all call the same implementation, then you can test it once and it
would suffice. It is the exact same amount of testing necessary as if it were
only a free function.
Post by Nicol Bolas
This is precisely why even simple algorithms like `search` and `accumulate`
and so forth are not bound to a specific container. Yes, we could always
rewrite them. But what's the point of that, when we don't *have to*?
I'm not asking for all functions to be added to each and every class. Only
those that are used very often and would benefit from conciseness of code and
discoverability.
--
Thiago Macieira - thiago (AT) macieira.info - thiago (AT) kde.org
Software Architect - Intel Open Source Technology Center
Thiago Macieira
2016-12-29 13:30:01 UTC
Permalink
On Thursday, 29 December 2016, at 04:35:57 BRST, Victor Dyachenko
Post by Victor Dyachenko
"We have unusable tool. Nobody uses it including myself, but we have it!"
:-)
You misunderstand me. I don't use std::string in the first place. Therefore, I
have no need for std::stringstream.

Though I also don't use iostreams, except in Hello World applications.
Post by Victor Dyachenko
Why not fix iostreams instead?
Because it is not fixable by design. It tries to be everything, so any
implementation will be bloated: a dependency on locales, which weigh
more than 1 MB by themselves; state for everything (formatting parameters,
flags, etc.); virtual calls; and so on.
So dump it and start something new. I think we'll all benefit from it, since
most of us think its design is bloated. It came from the very depths of C++'s
origins, when the rule was to make *everything* polymorphic and overrideable.
We've learned a lot since then.

Anyway, I don't oppose having a formatting method for strings, outside of
iostreams or a replacement of it. I think we need it, even.

I just don't think we should combine that with concatenation.
--
Thiago Macieira - thiago (AT) macieira.info - thiago (AT) kde.org
Software Architect - Intel Open Source Technology Center
Victor Dyachenko
2016-12-29 13:47:29 UTC
Permalink
Post by Thiago Macieira
I just don't think we should combine that with concatenation.
Makes sense. "<<" for non-character types is indeed a subset of the
concatenation feature. As for me, the ability to use "<<" instead of "+="
for char, const char * and std::string would be a good starting point.
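
A minimal sketch of what such a starting point might look like, assuming C++17. Nothing here is proposed wording: the operators live in a made-up namespace `sketch` and have to be pulled in with a using-directive, and the deleted `int` overload is an extra assumption added so that `s << 42` fails to compile instead of silently appending the character with code 42.

```cpp
#include <string>
#include <string_view>

namespace sketch {

// Appends string-like operands (std::string, const char*, string_view).
inline std::string& operator<<(std::string& lhs, std::string_view rhs)
{
    lhs.append(rhs);
    return lhs;
}

// Appends a single character.
inline std::string& operator<<(std::string& lhs, char rhs)
{
    lhs.push_back(rhs);
    return lhs;
}

// No number formatting in this sketch: reject plain integers outright.
std::string& operator<<(std::string&, int) = delete;

} // namespace sketch
```

Usage would then read `s << "ab" << 'c' << other;` after `using namespace sketch;`, with each call returning the left-hand string so the chain keeps appending to it.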
Thiago Macieira
2016-12-29 14:36:03 UTC
Permalink
On Thursday, 29 December 2016, at 05:47:29 BRST, Victor Dyachenko
Post by Victor Dyachenko
Post by Thiago Macieira
I just don't think we should combine that with concatenation.
Makes sense. "<<" for non-character type, indeed, is the subset of the
concatenation feature. As for me, the ability to use "<<" instead of "+="
for char, const char * and std::string would be a good starting point.
The left-most operand is always a class type, probably a template
instantiation of std::basic_string.
--
Thiago Macieira - thiago (AT) macieira.info - thiago (AT) kde.org
Software Architect - Intel Open Source Technology Center
Bengt Gustafsson
2016-12-29 16:14:55 UTC
Permalink
Note that deferring number formatting to existing functions reduces the
performance gain of this function: each of the number formatting functions
will have to allocate a std::string or similar and return it before the
Append() function or similar gets hold of it. Thus we can never get down to
the single-allocation solution envisioned. Furthermore, even finding out the
number of characters to allocate for a formatted number without actually
doing the formatting is very complicated, especially if it needs to heed
formatting options. Using a simplified formatted-length estimate can
work, but means that there could be either significant over-allocation if
guesses are too high or a significant amount of reallocation if guesses
are too low.

Another solution would be to use a thread-local buffer to basically sprintf
into and then copy the result, whose length is now known, into the string
(causing one allocation). But this can all be viewed as QoI-level discussion.
What is important is that if number formatting is excluded from a feature like
this, much of its appeal when it comes to reducing the allocation count is lost.
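
A minimal sketch of the thread-local buffer idea, assuming a hypothetical
helper named append_double and the "%g" format (neither is part of any
proposal here); the scratch buffer is reused per thread, so the only possible
allocation is the one made by the string itself:

```cpp
#include <cassert>
#include <cstddef>
#include <cstdio>
#include <string>

// Hypothetical helper: format into a reusable per-thread scratch buffer,
// then append the now-known number of characters in one step.
inline void append_double(std::string& s, double value)
{
    thread_local char scratch[64];  // ample for "%g" output of a double
    const int n = std::snprintf(scratch, sizeof scratch, "%g", value);
    assert(n > 0 && static_cast<std::size_t>(n) < sizeof scratch);
    s.append(scratch, static_cast<std::size_t>(n));
}
```

The cost is one extra copy from the scratch buffer into the string.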

So while this proposal without number capability does reduce allocations,
there are many more allocations that can be eliminated if numbers are
allowed. An alternative would be to introduce another set of number
formatting functions which return objects that actually do the formatting
into a buffer, but can also tell how large that buffer needs to be.
Unfortunately this would add to C++'s already large set of number formatting
functions, and there would be no warning if older, less performant functions
are used inside an Append call by mistake.

I think, but have no proof, that it could be easier to get a customization
point similar to operator<<(ostream&, T) for additional types if an
operator chaining approach is implemented than for an Append function. As
someone suggested, it should be possible to let this mechanism fall back to
the ostream-based operator if a specific overload is not
available. Allowing the stream manipulators such as hex etc. would be possible
but of course cumbersome, and would also perpetuate this old-fashioned way of
specifying formatting. One thought that struck me was that maybe a library
implementer can turn the tables and implement shifting into an ostream in
terms of this new feature (reducing the number of reallocations of the
stream buffer), but this is probably not possible due to backward
compatibility issues.
Olaf van der Spek
2016-12-29 16:47:57 UTC
Permalink
Post by Bengt Gustafsson
Note that deferring number formatting to existing functions reduces the
performance gain of this function: Each of the number formatter functions
will have to allocate a std::string or similar and return it before the
Append() function or similar gets hold of it. Thus we can never get down to
http://en.cppreference.com/w/cpp/utility/to_chars ;)
Post by Bengt Gustafsson
I think, but have no proof, that it could be easier to get a customization
point similar to operator<<(ostream&, T) for additional types if an operator
chaining approach is implemented than for a Append function. As someone
I don't think that's true; a customization point should be no trouble
with an append function.

Chaining looks good but wouldn't it make the single-allocation
optimization impossible?
--
Olaf
Nicol Bolas
2016-12-29 17:03:10 UTC
Permalink
On Thursday, December 29, 2016 at 11:47:59 AM UTC-5, Olaf van der Spek
Post by Thiago Macieira
Post by Bengt Gustafsson
Note that deferring number formatting to existing functions reduces the
performance gain of this function: Each of the number formatter
functions
Post by Bengt Gustafsson
will have to allocate a std::string or similar and return it before the
Append() function or similar gets hold of it. Thus we can never get down
to
http://en.cppreference.com/w/cpp/utility/to_chars ;)
`to_chars` lacks the ability to tell you exactly how many characters a
conversion will take. That's going to make it really difficult to
pre-allocate the right amount of memory. At least, not without allocating a
lot of extra space.
Thiago Macieira
2016-12-29 22:24:13 UTC
Permalink
On Thursday, 29 December 2016, at 09:03:10 BRST, Nicol Bolas wrote:
Post by Nicol Bolas
`to_chars` lacks the ability to tell you exactly how many characters a
conversion will take. That's going to make it really difficult to
pre-allocate the right amount of memory. At least, not without allocating a
lot of extra space.
Which you can't know until you perform the conversion anyway. So any code that
tries to single-allocate a formatting chain needs to estimate with the worst
case scenario (which, for 'f', could be in the hundreds of characters) and
then trim the string.
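
A minimal sketch of that worst-case-then-trim approach, restricted to integers
(where the worst case is small) and assuming a hypothetical helper named
append_int built on std::to_chars:

```cpp
#include <cassert>
#include <charconv>
#include <cstddef>
#include <limits>
#include <string>
#include <system_error>
#include <type_traits>

// Hypothetical helper: grow the string by the worst-case length, format
// in place with std::to_chars, then trim to what was actually written.
template <class Int>
void append_int(std::string& s, Int value)
{
    static_assert(std::is_integral_v<Int> && !std::is_same_v<Int, bool>,
                  "integers only; to_chars has no bool overload");
    // digits10 + 1 digits at most, plus one character for a sign.
    constexpr std::size_t worst = std::numeric_limits<Int>::digits10 + 2;
    const std::size_t old_size = s.size();
    s.resize(old_size + worst);
    char* first = s.data() + old_size;
    const auto res = std::to_chars(first, first + worst, value);
    assert(res.ec == std::errc{});
    s.resize(static_cast<std::size_t>(res.ptr - s.data()));  // trim, no reallocation
}
```

For floating point in fixed notation the worst case is far larger, which is
where the over-allocation concern comes from.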

As a QoI bonus, implementations can improve the estimation by performing log10
on the value, or log2 and divide by 3 (log2 is extremely fast).
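
The log2-and-divide-by-3 estimate could be sketched as follows. The names
max_decimal_digits and append_unsigned are made up for illustration, and the
bit count uses a plain loop rather than a hardware instruction to stay
portable. Since log10(2) is about 0.301, which is less than 1/3, the result is
always an upper bound on the digit count:

```cpp
#include <cassert>
#include <charconv>
#include <cstddef>
#include <cstdint>
#include <string>
#include <system_error>

// Upper bound on the number of decimal digits of v: bit width / 3 + 1.
inline std::size_t max_decimal_digits(std::uint64_t v)
{
    std::size_t bits = 0;
    while (v != 0) {
        ++bits;
        v >>= 1;
    }
    return bits / 3 + 1;
}

// Hypothetical helper: reserve the estimate, format in place, trim.
inline void append_unsigned(std::string& s, std::uint64_t value)
{
    const std::size_t old_size = s.size();
    const std::size_t bound = max_decimal_digits(value);
    s.resize(old_size + bound);
    char* first = s.data() + old_size;
    const auto res = std::to_chars(first, first + bound, value);
    assert(res.ec == std::errc{});
    s.resize(static_cast<std::size_t>(res.ptr - s.data()));
}
```

The estimate overshoots by at most a couple of characters for 64-bit values.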
--
Thiago Macieira - thiago (AT) macieira.info - thiago (AT) kde.org
Software Architect - Intel Open Source Technology Center
Nicol Bolas
2016-12-30 05:15:50 UTC
Permalink
On Thursday, 29 December 2016, at 09:03:10 BRST, Nicol Bolas wrote:
Post by Nicol Bolas
`to_chars` lacks the ability to tell you exactly how many characters a
conversion will take. That's going to make it really difficult to
pre-allocate the right amount of memory. At least, not without
allocating a
Post by Nicol Bolas
lot of extra space.
Which you can't know until you perform the conversion anyway. So any code that
tries to single-allocate a formatting chain needs to estimate with the worst
case scenario (which, for 'f', could be in the hundreds of characters) and
then trim the string.
As a QoI bonus, implementations can improve the estimation by performing log10
on the value, or log2 and divide by 3 (log2 is extremely fast).
Don't misunderstand my point; it's a perfectly fine design. I was just
pointing out that this makes it essentially impossible to have both a
single-allocation append design and in-situ string formatting during append
operations.

So it's probably best to leave that alone.
Thiago Macieira
2016-12-30 12:19:17 UTC
Permalink
On Thursday, 29 December 2016, at 21:15:50 BRST, Nicol Bolas wrote:
Post by Nicol Bolas
Don't misunderstand my point; it's a perfectly fine design. I was just
pointing out that this makes it essentially impossible to have both a
single-allocation append design and in-situ string formatting during append
operations.
So it's probably best to leave that alone.
That's why I've been saying that single-allocation appending and formatting
should be separate things.

But to enable the latter, the former should allow for a generator object that
can report its maximum size in constant time, then report the actual
size later.
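
A rough sketch of such a generator protocol, under the assumption that each
generator exposes max_size() and write(char*); the names text_gen, dec_gen and
append_generated are invented for illustration and are not an existing or
proposed API:

```cpp
#include <cassert>
#include <charconv>
#include <cstddef>
#include <limits>
#include <string>
#include <string_view>
#include <system_error>

// Plain text: the maximum size is also the exact size.
struct text_gen {
    std::string_view text;
    std::size_t max_size() const { return text.size(); }
    char* write(char* out) const { return out + text.copy(out, text.size()); }
};

// Decimal integer: constant-time upper bound, actual size known after writing.
struct dec_gen {
    long long value;
    std::size_t max_size() const { return std::numeric_limits<long long>::digits10 + 2; }
    char* write(char* out) const
    {
        const auto res = std::to_chars(out, out + max_size(), value);
        assert(res.ec == std::errc{});
        return res.ptr;  // one past the last character written
    }
};

// Sum the upper bounds, grow once, let each generator write, then trim.
template <class... Gen>
void append_generated(std::string& s, const Gen&... gen)
{
    const std::size_t old_size = s.size();
    const std::size_t bound = (std::size_t{0} + ... + gen.max_size());
    s.resize(old_size + bound);
    char* out = s.data() + old_size;
    ((out = gen.write(out)), ...);
    s.resize(static_cast<std::size_t>(out - s.data()));
}
```

With this shape, append_generated(s, text_gen{"n="}, dec_gen{42}) grows s at
most once, and formatting options would live inside the generator types rather
than in the append function.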
--
Thiago Macieira - thiago (AT) macieira.info - thiago (AT) kde.org
Software Architect - Intel Open Source Technology Center
m***@gmail.com
2017-01-18 22:10:47 UTC
Permalink
Hi all,
I haven't read all the posts, but from what I've read, I'd say there are two
subjects here:
- 1st: concatenating several strings easily and efficiently.
For this, what I've done on my side is implement an algorithm (called
concat) that does it.
It takes several strings in a container as its argument (it can be an array,
an initializer_list, a vector, whatever) and returns the concatenated
string. The concatenation computes the required size so that it does only one
allocation.
Note that I also implemented a variant that adds a separator between
each std::string given as a parameter.

- 2nd: being able to create a string from a predefined format.
This is what printf is often used for. Having something similar in C++
would be nice. But I think this will require adding something to the
std::string interface, since I can hardly see an algorithm being able to do
that, because it is something quite string-specific.
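
To make the first point concrete, here is a sketch of what such a concat could
look like. It is a guess at the shape, not the actual implementation mentioned
above: one pass sums the sizes, reserve() makes the single allocation, and an
optional separator covers the second variant.

```cpp
#include <cassert>
#include <cstddef>
#include <initializer_list>
#include <string>
#include <string_view>

// Concatenate every element of a range of string-like values, optionally
// placing `sep` between consecutive elements.
template <class Range>
std::string concat(const Range& parts, std::string_view sep = {})
{
    std::size_t total = 0;
    std::size_t count = 0;
    for (const auto& part : parts) {
        total += std::string_view(part).size();
        ++count;
    }
    if (count > 1)
        total += sep.size() * (count - 1);

    std::string out;
    out.reserve(total);  // the only allocation
    bool first = true;
    for (const auto& part : parts) {
        if (!first)
            out += sep;
        first = false;
        out += std::string_view(part);
    }
    assert(out.size() == total);
    return out;
}

// Overload so that a braced list such as concat({"a", "b"}) works.
inline std::string concat(std::initializer_list<std::string_view> parts,
                          std::string_view sep = {})
{
    return concat<std::initializer_list<std::string_view>>(parts, sep);
}
```

For example, concat({"x", "y"}, "-") yields "x-y".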

Just my 2 cents,
Masse Nicolas.
Post by Olaf van der Spek
Hi,
One frequently needs to append stuff to strings, but the standard way
(s += "A" + "B" + to_string(42)) isn't optimal due to temporaries.
A variadic append() for std::string seems like the obvious solution.
It could support string_view, integers, maybe floats
but without formatting options..
It could even be extensible by calling append(s, t);
append(s, "A", "B", 42);
Would this be useful for the C++ std lib?
G M
2017-01-19 00:28:08 UTC
Permalink
Post by m***@gmail.com
- 2nd Be able to create a string from a predefined format
This is what printf is often used for. Having something similar in cpp
would be nice. But I think this will require adding something on the
std::string interface since I hardly see an algorithm being able to do that
because it is something quite string-specific .
Just my 2 cents,
Masse Nicolas.
string exposes all that is needed to write to it, expand it, and query its
size.
vector also exposes all these facilities.

So format doesn't need to be a member function of string to use these
services to write data into a string.
And if it were a member of string, we'd not have the ability to use a
vector, even though vector offers the same services as string for accessing
the internal array. If vector offers the ability, why shouldn't we use it?

The ability to write into a C array seems essential to me, both for writing
C-compatible libraries and to avoid memory allocation. Once you want to
avoid memory allocation, you may want to format into std::array as well.
So supporting non-allocating buffers is essential to my
mind. And you can't do that if your format function is tied to string.

It seems to me a buffer concept would unify these and allow format to work
with anything while being no less, or only a little less, easy to use.

The cpp format has features that enable other containers to work. But I've
yet to look into cpp format in detail. But if formatting gets into the
standard, as it surely must, I'm sure it can be made even easier to use.

There may be good reasons not to go this route, but I haven't heard anything
yet that convinces me that this isn't the way to go. Having only a single
string member function is a bad option to my mind. A member function of
string and other options too, sure, if there's a good reason for it, but
not as the only option.

I note we have std::size() and std::data(). A question for the
Committee: if we also had std::is_fixed_size() and std::resize(), could we
not make a std::format(container, ...) work with any of these types
without anything concepty being required?
Edward Catmur
2017-01-20 10:58:48 UTC
Permalink
Post by m***@gmail.com
- 2nd Be able to create a string from a predefined format
Post by m***@gmail.com
This is what printf is often used for. Having something similar in cpp
would be nice. But I think this will require adding something on the
std::string interface since I hardly see an algorithm being able to do that
because it is something quite string-specific .
Just my 2 cents,
Masse Nicolas.
string exposes all that is needed to write or expand it and inquire it's
size.
vector also exposes all these facilities.
So format doesn't need to be a member function of string to use these
services to write data into a string,
And if it were a member of string, we'd not have the ability to use a
vector even though vector offers the same services as string to access the
internal array. so if vector offers the ability why shouldn't we use it.
The ability to write into a C array seems essential to me, both
for writing C compatible libraries, and to avoid memory allocation. Once
you want to avoid memory allocation, you may want to format into std::array
also.
So supporting non memory allocating buffers is essential to my
mind. And you can't do that if your format function is tied into a string.
It seems to me a buffer concept would unify these and allow format to work
with anything and be no less or little less easy to use.
The cpp format has features that enable other containers to work. But I've
yet to look into cpp format in detail. But if formatting gets into the
standard as it surely must I'm sure it can make it even easier to use.
There may be good reasons not to go this route but I haven't heard
anything that convinces me yet that this isn't the way to go. Having only a
single string member function is a bad option to my mind. A member function
of string and other options too, sure, if there's a good reason for it, but
not the as the only option.
I note we have std::size() and std::data(). A question for the
Committee, if we also had std::is_fixed_size() and std::resize() could we
not make an std::format(container, ...) work with any of these types
without anything concepty being required?
Formatting into a container is overly restrictive; one may also want to be
able to format into a preallocated subrange (e.g. character positions 10-20
of a string) or to an output stream (console, file or network).

I think the more general concept is OutputRange, which could encapsulate
e.g. a back inserter for string or vector (expandable storage), a pair of
char pointers for an array, C array or subrange, or an ostreambuf_iterator
for an output stream.
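
As a rough illustration, using today's plain output iterators rather than a
real OutputRange concept (so there is no end bound and no overflow check), the
same routine can already target all of these. The names put_text, put_int and
append_text are hypothetical:

```cpp
#include <algorithm>
#include <cassert>
#include <charconv>
#include <iterator>
#include <string>
#include <string_view>
#include <system_error>

// Write text through any output iterator; returns the advanced iterator.
template <class OutputIt>
OutputIt put_text(OutputIt out, std::string_view text)
{
    return std::copy(text.begin(), text.end(), out);
}

// Format an integer into a small stack buffer, then copy it out.
template <class OutputIt>
OutputIt put_int(OutputIt out, long long value)
{
    char digits[24];  // enough for any long long, including the sign
    const auto res = std::to_chars(digits, digits + sizeof digits, value);
    assert(res.ec == std::errc{});
    return std::copy(digits, res.ptr, out);
}

// Convenience wrapper for the expandable-storage case.
inline void append_text(std::string& s, std::string_view text)
{
    put_text(std::back_inserter(s), text);
}
```

The same put_text call accepts std::back_inserter(s) for a string or vector, a
char* into an array or subrange, or a std::ostreambuf_iterator<char> for a
stream; a bounded OutputRange would additionally carry the end position.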
g***@gmail.com
2017-01-20 12:22:35 UTC
Permalink
Post by Edward Catmur
Post by G M
I note we have std::size() and std::data(). A question for the
Committee, if we also had std::is_fixed_size() and std::resize() could we
not make an std::format(container, ...) work with any of these types
without anything concepty being required?
Formatting into a container is overly restrictive; one may also want to be
able to format into a preallocated subrange (e.g. character positions 10-20
of a string) or to an output stream (console, file or network).
I think the more general concept is OutputRange, which could encapsulate
e.g. a back inserter for string or vector (expandable storage), a pair of
char pointers for an array, C array or subrange, or an ostreambuf_iterator
for an output stream.
I agree. I was experimenting today with an interface where you create a
buffer object to format into that is also flushable.
A make-buffer routine creates a wrapper object for a vector, array, C
array, stream, etc., exposing a common interface
of data(), resize() and flush() methods, plus can_resize() and can_flush(),
and I passed that to my format routine.

The format routine just writes into the buffer, extending the buffer if
possible until a maximum size has been reached.
If the buffer advertises that it can be flushed, it is flushed and the process
is repeated until the formatting is done.
If the buffer can be neither flushed nor extended, writing stops and
there is an error.
A std::vector would not advertise that it can flush, and its flush method
would do nothing.
A std::array would not advertise that it can be extended, and resizing would
do nothing.
A stream might advertise itself as a buffer that can be extended and
flushed, or it may say it's a fixed size but can be flushed.

Vectors, strings and arrays are buffers; streams wrap buffers. So they
can all provide a unified buffer interface.

I tried a little test of that and it seemed to be workable. I just had
one format routine and it was able to write to all these types.
Also, the buffer interface isn't format-specific, so any other kind
of append or similar routine could use it.
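
Roughly, the idea could be sketched like this. It is a simplified
reconstruction, not the experiment's actual code: the wrapper names, the
max_size() member and the "used length" bookkeeping for fixed storage are
assumptions added to keep the example short.

```cpp
#include <algorithm>
#include <array>
#include <cassert>
#include <cstddef>
#include <sstream>
#include <string>
#include <string_view>

// Growable storage: can be extended, has nothing to flush to.
class string_buffer {
public:
    explicit string_buffer(std::string& s) : s_(s) {}
    static constexpr bool can_resize() { return true; }
    static constexpr bool can_flush() { return false; }
    char* data() { return s_.data(); }
    std::size_t size() const { return s_.size(); }
    std::size_t max_size() const { return s_.max_size(); }
    void resize(std::size_t n) { s_.resize(n); }
    void flush() {}  // no-op
private:
    std::string& s_;
};

// Fixed storage (C array, std::array, subrange): cannot grow or flush.
class fixed_buffer {
public:
    fixed_buffer(char* first, std::size_t capacity) : first_(first), cap_(capacity) {}
    static constexpr bool can_resize() { return false; }
    static constexpr bool can_flush() { return false; }
    char* data() { return first_; }
    std::size_t size() const { return used_; }
    std::size_t max_size() const { return cap_; }
    void resize(std::size_t n) { assert(n <= cap_); used_ = n; }
    void flush() {}  // no-op
private:
    char* first_;
    std::size_t cap_;
    std::size_t used_ = 0;
};

// Fixed-size chunk in front of a stream: cannot grow, but can be flushed.
template <std::size_t N>
class stream_buffer {
    static_assert(N > 0, "need room for at least one character");
public:
    explicit stream_buffer(std::ostream& os) : os_(os) {}
    static constexpr bool can_resize() { return false; }
    static constexpr bool can_flush() { return true; }
    char* data() { return chunk_.data(); }
    std::size_t size() const { return used_; }
    std::size_t max_size() const { return N; }
    void resize(std::size_t n) { assert(n <= N); used_ = n; }
    void flush()
    {
        os_.write(chunk_.data(), static_cast<std::streamsize>(used_));
        used_ = 0;
    }
private:
    std::ostream& os_;
    std::array<char, N> chunk_{};
    std::size_t used_ = 0;
};

// The single write routine: fill the buffer, flush when full if possible,
// and report failure when the buffer is full and cannot be flushed.
template <class Buffer>
bool write_text(Buffer& buf, std::string_view text)
{
    while (!text.empty()) {
        const std::size_t used = buf.size();
        const std::size_t room = buf.max_size() - used;
        if (room == 0) {
            if (!buf.can_flush())
                return false;
            buf.flush();
            continue;
        }
        const std::size_t n = std::min(room, text.size());
        buf.resize(used + n);
        text.copy(buf.data() + used, n);
        text.remove_prefix(n);
    }
    return true;
}
```

With a std::ostringstream os and a stream_buffer<4> over it, write_text sends
the text through in four-character chunks, and a final flush() pushes out the
remainder.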
o***@join.cc
2017-01-24 15:56:00 UTC
Permalink
Post by Edward Catmur
Formatting into a container is overly restrictive; one may also want to be
able to format into a preallocated subrange (e.g. character positions 10-20
of a string) or to an output stream (console, file or network).
I think the more general concept is OutputRange, which could encapsulate
e.g. a back inserter for string or vector (expandable storage), a pair of
char pointers for an array, C array or subrange, or an ostreambuf_iterator
for an output stream.
Sounds good, how would that interact with calls to reserve() and memcpy()
optimizations (for larger strings) though?

I don't think anything is stopping one from providing both variants.
'Edward Catmur' via ISO C++ Standard - Future Proposals
2017-01-28 18:39:38 UTC
Permalink
Post by Edward Catmur
Formatting into a container is overly restrictive; one may also want to be
able to format into a preallocated subrange (e.g. character positions 10-20
of a string) or to an output stream (console, file or network).
I think the more general concept is OutputRange, which could encapsulate
e.g. a back inserter for string or vector (expandable storage), a pair of
char pointers for an array, C array or subrange, or an ostreambuf_iterator
for an output stream.
Sounds good, how would that interact with calls to reserve() and memcpy()
optimizations (for larger strings) though?


That's a good point, and it's probably not enough (though theoretically
correct) to say that exponential growth makes reserve() moot. What I'd like
would be an OutputRange concept that allows creation of a contiguous memory
range, probably via the & and += operators, so giving output iterators more
of the hierarchy that input iterators demonstrate. Since a back inserter is
aware of its container target, it can know whether the container is
contiguous. For example:

auto it = back_inserter(s); // it models ContiguousOutputIterator
auto p = &*it; // returns s.data() + s.size()
it += 12; // performs s.resize(s.size() + 12, uninitialized)
memcpy(p, buf, 12);



I don't think anything is stopping one from providing both variants.
o***@gmail.com
2017-06-17 07:30:06 UTC
Permalink
Post by Olaf van der Spek
Hi,
One frequently needs to append stuff to strings, but the standard way
(s += "A" + "B" + to_string(42)) isn't optimal due to temporaries.
A variadic append() for std::string seems like the obvious solution.
It could support string_view, integers, maybe floats
but without formatting options..
It could even be extensible by calling append(s, t);
append(s, "A", "B", 42);
Would this be useful for the C++ std lib?
So I wrote some trivial functions and IMO it works nicely. The
implementation of append might not be as smart as it could be but it's good
enough for me and I really like the interface.

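#include <string>
#include <string_view>

// Append text; string literals and std::string convert to string_view.
inline std::string& operator<<(std::string& a, std::string_view b)
{
    return a += b;
}

// Append an integer in decimal (to_string still creates a temporary).
inline std::string& operator<<(std::string& a, long long b)
{
    return a += std::to_string(b);
}

// Base case: nothing left to append.
inline void append(std::string&)
{
}

// Append the first argument, then recurse on the rest.
template<class T, class... A>
void append(std::string& s, const T& v, const A&... a)
{
    s << v;
    append(s, a...);
}

// Build a new string from all arguments.
template<class... A>
std::string concat(const A&... a)
{
    std::string s;
    append(s, a...);
    return s;
}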