Difference between input range and forward range
Jonathan M Davis via Digitalmars-d
digitalmars-d at puremagic.com
Tue Nov 10 10:54:28 PST 2015
On Tuesday, 10 November 2015 at 16:33:02 UTC, Ur at nuz wrote:
> I agree with these considerations. When I define a non-copyable
> range (with disabled this), a lot of standard Phobos functions
> fail to compile instead of using the *save* method. So the logical
> question is in which cases we should use a plain old struct copy
> and when we should use *save* on forward ranges.
>
> Another good question is whether input ranges should be copyable
> (or for which types of ranges they can be copyable). A good example
> is a network socket as an input range: we can't save the state
> of the socket stream and get the consumed data again, so I think
> copying such a range is meaningless. If we
> want to pass it somewhere, it's better to pass it by reference.
Passing by reference really doesn't work with ranges. Consider
that most range-based functions are lazy and wrap the range that
they're given in a new range. e.g.
    auto r = filter!pred(range);
or
    auto r = map!func(range);
The range has to be copied for that to work. And even if you
could make it so that the result of functions like map or filter
referred to the original range by reference, their return value
would not be returned by ref, so if a function required that its
argument be passed by ref, then you couldn't chain it. So,
requiring that ranges be passed by ref would pretty much kill
function chaining.
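To make the chaining point concrete, here's a minimal sketch. The helper name doubledEvens and the two lambdas are made up for illustration; filter, map, and equal are the usual Phobos functions.

```d
import std.algorithm.comparison : equal;
import std.algorithm.iteration : filter, map;

// Hypothetical helper: lazily doubles the even elements of an int slice.
auto doubledEvens(int[] range)
{
    // filter stores a copy of `range` inside its result, and map stores
    // a copy of that result inside its own. Neither takes its argument
    // by ref, which is what lets the rvalue returned by filter be fed
    // straight into map.
    return range.filter!(a => a % 2 == 0).map!(a => a * 2);
}

unittest
{
    assert(doubledEvens([1, 2, 3, 4]).equal([4, 8]));
}
```

If map took its argument by ref, the call above would not compile as written, because the result of filter is a temporary.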
> Also, passing a range somewhere to access it in two different
> places simultaneously is a bad idea. It looks like we have the
> current approach with the range postblit constructor
> and *save* because we have it for structs and it works somehow
> (so far) for trivial cases. But we don't have clear intentions
> about how it should really work.
It's mostly clear, but it isn't necessarily straightforward to
get it right. If you want to duplicate a range, then you _must_
use save. Copying a range by assigning it to another range is not
actually copying it per the range API. You pretty much have to
consider it a move and consider the original unusable after the
copy.
The problem is that for arrays and many of the common ranges,
copying the range and calling save are semantically the same, so
it's very easy to write code which assumes that behavior and then
doesn't work with other types of ranges. That's why it's
critical to test range-based functions with a variety of range
types - particularly reference types in addition to value types
or dynamic arrays.
> Copying and passing ranges should also be specified as part of
> the range protocol, because it's a very common use case and
> shouldn't be ambiguous.
The semantics of copying a range depend heavily on how a range is
implemented and cannot be defined in the general case:
auto copy = orig;
Dynamic arrays and classes will function fundamentally
differently, and with structs, there are a variety of different
semantics that that copy could have. What it ultimately comes
down to is that while the range API can require that the copy be
in the exact same state that the original was in, it can't say
anything about the state of the original after the copy.
Well-behaved range-based code has to assume that once orig has
been copied, it is unusable. If the code wants to actually get a
duplicate of the range, then it will have to use save, and the
semantics of that _are_ well-defined and do not depend on the
type of the range.
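As a rough illustration of the difference, the sketch below advances a plain copy and a saved copy and then looks at the original. ClassRange and both function names are hypothetical, invented for this example. Reading orig after the copy is exactly what well-behaved code must not do; it's done here only to expose what happens.

```d
import std.range.primitives;

// Hypothetical reference-type forward range over an int slice.
final class ClassRange
{
    private int[] data;
    this(int[] data) { this.data = data; }

    @property bool empty() const { return data.length == 0; }
    @property int front() const { return data[0]; }
    void popFront() { data = data[1 .. $]; }
    // A class has to allocate a new object to get an independent position.
    @property ClassRange save() { return new ClassRange(data); }
}

static assert(isForwardRange!ClassRange);

// Advances a plain copy, then reports what the original yields.
int frontAfterAdvancingCopy(R)(R orig) if (isForwardRange!R)
{
    auto copy = orig;   // dynamic array: new slice; class: same object
    copy.popFront();
    return orig.front;  // the result depends on the type of R
}

// Advances a saved copy, then reports what the original yields.
int frontAfterAdvancingSave(R)(R orig) if (isForwardRange!R)
{
    auto saved = orig.save;
    saved.popFront();
    return orig.front;  // same result for every forward range
}

unittest
{
    assert(frontAfterAdvancingCopy([1, 2, 3]) == 1);
    assert(frontAfterAdvancingCopy(new ClassRange([1, 2, 3])) == 2);
    assert(frontAfterAdvancingSave([1, 2, 3]) == 1);
    assert(frontAfterAdvancingSave(new ClassRange([1, 2, 3])) == 1);
}
```

The unittest block can be run with something like dmd -unittest -main -run on the file.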
> Also, since a range can be a class object, we must consider how
> such ranges should behave.
There's really nothing to consider here. It's known how they
should behave. There's really only one way that they _can_
behave. One of the main reasons that save exists is because of
classes. While copying a dynamic array or many struct types is
equivalent to save, it _can't_ be equivalent with a class. When
you consider that fact, the required behavior of ranges pretty
much falls into place on its own. We may very well need to be far
clearer about what those semantics are and how that affects best
practices, but there really isn't much (if any) wiggle room in
what the range API does and doesn't guarantee and how it should
be used. The problem is whether it's _actually_ used that way.
If a range-based function is tested with a variety of range types
- dynamic arrays, value types, reference types, etc. - then it
becomes clear very quickly when calls to save are required and
how the function must be written to work for all of those range
types. But far too often, range-based functions are tested with
dynamic arrays and a few struct range types that wrap dynamic
arrays, and bugs with regard to reference type ranges are not
found. So, there's almost certainly a lot of range-based code out
there that works fantastically with dynamic arrays but would fail
miserably with a number of other range types.
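Here's a sketch of the kind of bug that only shows up with a reference type range. The names meanBuggy, mean, and ClassRange are made up for this example; walkLength and isForwardRange are from std.range.primitives.

```d
import std.range.primitives;

// Hypothetical reference-type forward range used only for testing.
final class ClassRange
{
    private int[] data;
    this(int[] data) { this.data = data; }

    @property bool empty() const { return data.length == 0; }
    @property int front() const { return data[0]; }
    void popFront() { data = data[1 .. $]; }
    @property ClassRange save() { return new ClassRange(data); }
}

// Buggy: passes `r` to walkLength without save and then reuses it.
double meanBuggy(R)(R r) if (isForwardRange!R)
{
    immutable n = walkLength(r);      // consumes r if R is a reference type
    double total = 0;
    foreach (e; r)
        total += e;
    return n == 0 ? 0.0 : total / n;
}

// Fixed: walkLength gets an independent duplicate.
double mean(R)(R r) if (isForwardRange!R)
{
    immutable n = walkLength(r.save);
    double total = 0;
    foreach (e; r)
        total += e;
    return n == 0 ? 0.0 : total / n;
}

unittest
{
    // With a dynamic array, both versions look correct.
    assert(meanBuggy([1, 2, 3, 4]) == 2.5);
    assert(mean([1, 2, 3, 4]) == 2.5);

    // With a class, only the version that calls save is correct.
    assert(meanBuggy(new ClassRange([1, 2, 3, 4])) == 0.0);
    assert(mean(new ClassRange([1, 2, 3, 4])) == 2.5);
}
```

A test suite that only ever passes in dynamic arrays would never notice the difference between the two functions.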
For the most part, I think that it's pretty clear how ranges have
to act and how they need to be used based on their API when you
actually look at how the range API interacts with different types
of ranges, but we often do not go much beyond dynamic arrays and
miss out on some of the subtleties.
We really do need some good write-ups on ranges and their best
practices. I've worked on that before but never managed to spend
the time to finish it. Clearly, I need to fix that.
- Jonathan M Davis