Difference between input range and forward range

Tue Nov 10 10:54:28 PST 2015

On Tuesday, 10 November 2015 at 16:33:02 UTC, Ur@nuz wrote:
> I agree with these considerations. When I define a non-copyable 
> range (with disabled this), a lot of standard Phobos functions 
> fail to compile instead of using the *save* method. So the logical 
> question is in which cases we should use a plain old struct copy 
> and when we should use *save* on forward ranges.
>
> Another good question is whether input ranges should be copyable 
> (or for which types of ranges copying makes sense). A good example 
> is a network socket as an input range: we can't save the state 
> of the socket stream and get the consumed data again, so I think 
> copying such a range is meaningless (in my opinion). If we 
> want to pass it somewhere, it's better to pass it by reference.

Passing by reference really doesn't work with ranges. Consider 
that most range-based functions are lazy and wrap the range that 
they're given in a new range. e.g.

auto r = filter!pred(range);

or

auto r = map!func(range);

The range has to be copied for that to work. And even if you 
could make it so that the result of functions like map or filter 
referred to the original range by reference, their return value 
would not be returned by ref, so if a function required that its 
argument be passed by ref, then you couldn't chain it. So, 
requiring that ranges be passed by ref would pretty much kill 
function chaining.
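
As a rough sketch of what that chaining looks like in practice (the helper name evenSquares is made up for illustration and is not a Phobos function), each call below takes its argument by value and wraps it in a new lazy range, so the rvalue returned by one call can be fed straight into the next:

```d
import std.algorithm.comparison : equal;
import std.algorithm.iteration : filter, map;
import std.range : iota;

// Hypothetical helper: squares of the even numbers in [0, n).
auto evenSquares(int n)
{
    // iota's result is copied into filter's result, which is in turn
    // copied into map's result. None of these are passed by ref.
    return iota(n)
        .filter!(a => a % 2 == 0)
        .map!(a => a * a);
}

void main()
{
    assert(evenSquares(10).equal([0, 4, 16, 36, 64]));
}
```

If filter or map took their argument by ref, the temporary returned by the previous call in the chain could not be passed to them.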

> Also, passing a range somewhere to access it in two different 
> places simultaneously is a bad idea. It looks like we have the 
> current approach of a range postblit constructor 
> and *save* because we have it for structs and it works somehow 
> (so far) for trivial cases. But we don't have clear intentions 
> about how it should really work.

It's mostly clear, but it isn't necessarily straightforward to 
get it right. If you want to duplicate a range, then you _must_ 
use save. Copying a range by assigning it to another range is not 
actually copying it per the range API. You pretty much have to 
consider it a move and consider the original unusable after the 
copy.

The problem is that for arrays and many of the common ranges, 
copying the range and calling save are semantically the same, so 
it's very easy to write code which assumes that behavior and then 
doesn't work with other types of ranges. That's why it's 
critical to test range-based functions with a variety of range 
types - particularly reference types in addition to value types 
or dynamic arrays.
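
To illustrate the kind of bug this catches, here is a minimal sketch. The two function names are hypothetical; inputRangeObject from std.range.interfaces is used only as a convenient way to get a class-based range wrapping an array. The naive version passes with a dynamic array and silently gives the wrong answer with a reference type:

```d
import std.range.interfaces : inputRangeObject;
import std.range.primitives;

// Buggy on purpose: assumes a plain copy is an independent duplicate.
int sumTwoPassesNaive(R)(R r)
{
    int total;
    auto firstPass = r; // for a class, this is the same object as r
    for (; !firstPass.empty; firstPass.popFront())
        total += firstPass.front;
    for (; !r.empty; r.popFront())
        total += r.front;
    return total;
}

// Correct: asks the range for a real duplicate via save.
int sumTwoPasses(R)(R r)
if (isForwardRange!R)
{
    int total;
    auto firstPass = r.save;
    for (; !firstPass.empty; firstPass.popFront())
        total += firstPass.front;
    for (; !r.empty; r.popFront())
        total += r.front;
    return total;
}

void main()
{
    // Dynamic array: both versions see every element twice.
    assert(sumTwoPassesNaive([1, 2, 3]) == 12);
    assert(sumTwoPasses([1, 2, 3]) == 12);

    // Class-based range: the naive version loses the second pass.
    assert(sumTwoPassesNaive(inputRangeObject([1, 2, 3])) == 6);
    assert(sumTwoPasses(inputRangeObject([1, 2, 3])) == 12);
}
```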

> Copying and passing ranges should also be specified as part of 
> the range protocol, because it's a very common use case and 
> shouldn't be ambiguous.

The semantics of copying a range depend heavily on how a range is 
implemented and cannot be defined in the general case:

auto copy = orig;

Dynamic arrays and classes will function fundamentally 
differently, and with structs, there are a variety of different 
semantics that that copy could have. What it ultimately comes 
down to is that while the range API can require that the copy be 
in the exact same state that the original was in, it can't say 
anything about the state of the original after the copy. 
Well-behaved range-based code has to assume that once orig has 
been copied, it is unusable. If the code wants to actually get a 
duplicate of the range, then it will have to use save, and the 
semantics of that _are_ well-defined and do not depend on the 
type of the range.
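
A small sketch of how differently that one line can behave. ValueCounter and RefCounter are hypothetical types written just for this example. The same plain copy is independent for a dynamic array and a value-type struct but shared for a class, whereas save gives an independent duplicate in every case:

```d
import std.range.primitives;

// Hypothetical value-type range: all state lives in the struct itself.
struct ValueCounter
{
    int current, end;
    @property bool empty() const { return current >= end; }
    @property int front() const { return current; }
    void popFront() { ++current; }
    @property ValueCounter save() { return this; }
}

// Hypothetical reference-type range: copies refer to one shared object.
final class RefCounter
{
    int current, end;
    this(int current, int end) { this.current = current; this.end = end; }
    @property bool empty() const { return current >= end; }
    @property int front() const { return current; }
    void popFront() { ++current; }
    @property RefCounter save() { return new RefCounter(current, end); }
}

static assert(isForwardRange!ValueCounter);
static assert(isForwardRange!RefCounter);

void main()
{
    int[] arr = [0, 1, 2];
    auto arrCopy = arr;
    arrCopy.popFront();
    assert(arr.front == 0);   // slices advance independently

    auto val = ValueCounter(0, 3);
    auto valCopy = val;
    valCopy.popFront();
    assert(val.front == 0);   // struct state was copied

    auto cls = new RefCounter(0, 3);
    auto clsCopy = cls;
    clsCopy.popFront();
    assert(cls.front == 1);   // same object, so the original advanced too

    auto clsDup = cls.save;
    clsDup.popFront();
    assert(cls.front == 1);   // save gave an independent duplicate
    assert(clsDup.front == 2);
}
```

Generic code can't know which of these cases it has been handed, which is why it has to treat orig as consumed after the copy.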

> Also, since a range can be a class object, we must consider how 
> they should behave.

There's really nothing to consider here. It's known how they 
should behave. There's really only one way that they _can_ 
behave. One of the main reasons that save exists is because of 
classes. While copying a dynamic array or many struct types is 
equivalent to save, it _can't_ be equivalent with a class. When 
you consider that fact, the required behavior of ranges pretty 
much falls into place on its own. We may very well need to be far 
clearer about what those semantics are and how that affects best 
practices, but there really isn't much (if any) wiggle room in 
what the range API does and doesn't guarantee and how it should 
be used. The problem is whether it's _actually_ used that way.

If a range-based function is tested with a variety of range types 
- dynamic arrays, value types, reference types, etc. - then it 
becomes clear very quickly when calls to save are required and 
how the function must be written to work for all of those range 
types. But far too often, range-based functions are tested with 
dynamic arrays and a few struct range types that wrap dynamic 
arrays, and bugs with regard to reference type ranges are not 
found. So, there's almost certainly a lot of range-based code out 
there that works fantastically with dynamic arrays but would fail 
miserably with a number of other range types.

For the most part, I think that it's pretty clear how ranges have 
to act and how they need to be used based on their API when you 
actually look at how the range API interacts with different types 
of ranges, but we often do not go much beyond dynamic arrays and 
miss out on some of the subtleties.

We really do need some good write-ups on ranges and their best 
practices. I've worked on that before but never managed to spend 
the time to finish it. Clearly, I need to fix that.

- Jonathan M Davis