Velocity Reviews - Computer Hardware Reviews

Test if const_iterator may be dereferenced - with no direct access to original vector.

 
 
mathog
      05-01-2013
What does one do in this situation:

....
Glib::ustring::const_iterator icc;
....
icc = _spans[lastspan].input_stream_first_character;
....
// need a test here to see if the next line is safe
if(*icc){

In the textbook examples one has both the vector and the iterator, so
the test can be rolled together on one line like:

if(icc != avector.end() && *icc)

In this case it isn't entirely clear which vector that iterator is
referencing. (Because _spans hangs onto the iterator but does not
store the vector, at least not publicly.) It might (might!) be possible
to hunt the vector down, by chasing backwards through half a dozen
objects to find it, but why should that be necessary? Is there not in
C++ something like:

if(icc->dereferencable()){

?

This came up in a situation where an empty text span was embedded
between others with characters. So on the 3rd span (or whatever it was)
the value of icc was set to a non-dereferencable value from the get go,
that value having been stored there long ago and far away in the code.

Of course without the missing test the program segfaulted when it tried
to dereference the const_iterator for this empty span.

I suppose that the desired result could be accomplished with try/catch,
but wonder if C++ iterators do not in general have some method for doing
this.

Thank you,

David Mathog
 
Bart van Ingen Schenau
      05-02-2013
On Wed, 01 May 2013 13:57:36 -0700, mathog wrote:

> What does one do in this situation:
>
> ...
> Glib::ustring::const_iterator icc;
> ...
> icc = _spans[lastspan].input_stream_first_character;
> ...
> // need a test here to see if the next line is safe
> if(*icc){
>
> In the textbook examples one has both the vector and the iterator, so
> the test can be rolled together on one line like:
>
> if(icc != avector.end() && *icc)
>
> In this case it isn't entirely clear which vector that iterator is
> referencing. (Because _spans hangs onto the iterator but does not store
> the vector, at least not publicly.)


Within the concept of C++ iterators, you always need *two* iterators: one
to indicate the current position and another to indicate the end of the
range. And although it is common for the end of a range to coincide with
the end of a container, this is by no means part of the concept of
iterators.
For that reason, the common solution would be:

...
Glib::ustring::const_iterator icc, end;
...
icc = _spans[lastspan].input_stream_first_character;
end = _spans[lastspan].input_stream_end;
...
// comparing against end first makes the dereference on the next line safe
if (icc != end && *icc) /* do something */

The important change here is that a span knows where it ends. For the
calling code, it does not matter if that end coincides with the end of a
vector, or if that end happens to be the start of the next span.
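To make the "span knows where it ends" idea concrete, here is a minimal sketch. It uses std::string instead of Glib::ustring so that it compiles on its own, and Span, make_span and starts_with_nonzero are made-up names for illustration, not anything from the original code base.

```cpp
#include <cassert>
#include <string>

// Hypothetical stand-in for the poster's span type: it stores both ends of
// its range instead of only the first character.
struct Span {
    std::string::const_iterator first;  // first character of the span
    std::string::const_iterator end;    // one past the last character of the span
};

// Builds a span covering the offsets [from, to) of text.
Span make_span(const std::string& text,
               std::string::size_type from,
               std::string::size_type to) {
    assert(from <= to && to <= text.size());
    typedef std::string::difference_type diff;
    Span s = { text.begin() + static_cast<diff>(from),
               text.begin() + static_cast<diff>(to) };
    return s;
}

// True only when the span is non-empty and its first character is not '\0'.
// The range check comes first, so *s.first is never evaluated for an empty span.
bool starts_with_nonzero(const Span& s) {
    return s.first != s.end && *s.first != '\0';
}
```

An empty span is then simply one where first == end, and the test rejects it without ever touching the character data.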

> It might (might!) be possible to
> hunt the vector down, by chasing backwards through half a dozen objects
> to find it, but why should that be necessary? Is there not in C++
> something like:
>
> if(icc->dereferencable()){
>
> ?


There are several problems with requiring such a function.
First of all, the function can't tell if the iterator is still within the
range it is meant to iterate over, because ranges are not required to end
on a non-dereferenceable iterator.
Secondly, iterators are meant to be lightweight objects. Not much more
than a pointer or a wrapper around one with knowledge of how to access the
next element. As such, determining dereferenceability becomes as hard as
determining dereferenceability for a plain pointer, which means
practically impossible.
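The first point can be seen in a small sketch, again assuming std::string and with first_word_end and range_length as made-up helper names. The end of the range "first word" is an iterator to the space that follows it. That iterator can be dereferenced without any problem, yet it is outside the range.

```cpp
#include <algorithm>
#include <cassert>
#include <cstddef>
#include <string>

typedef std::string::const_iterator Iter;

// End of the first word: an iterator to the first space, or text.end() when
// text contains no space.
Iter first_word_end(const std::string& text) {
    Iter pos = std::find(text.begin(), text.end(), ' ');
    // When a space was found, the end of the range is itself dereferenceable.
    assert(pos == text.end() || *pos == ' ');
    return pos;
}

// Length of [first, last), found by walking until the caller-supplied end.
// Only the comparison with last says where to stop.
std::size_t range_length(Iter first, Iter last) {
    std::size_t n = 0;
    for (; first != last; ++first) ++n;
    return n;
}
```

A hypothetical dereferencable() member would report "fine" for the end of the first word, which is exactly the position the caller must not read as part of that word.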

>
> This came up in a situation where an empty text span was embedded
> between others with characters. So on the 3rd span (or whatever it was)
> the value of icc was set to a non-dereferencable value from the get go,
> that value having been stored there long ago and far away in the code.
>
> Of course without the missing test the program segfaulted when it tried
> to dereference the const_iterator for this empty span.
>
> I suppose that the desired result could be accomplished with try/catch,
> but wonder if C++ iterators do not in general have some method for doing
> this.


As the segfault was the result of undefined behaviour, try/catch would
not have reliably helped you.
The general method for checking if an iterator is still within range is
to test if it has not reached the end iterator for that range yet.
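For contrast, here is a sketch of where try/catch does work, assuming a std::string and an index instead of an iterator. std::string::at is specified to throw std::out_of_range for a bad index, so there is an exception to catch. Dereferencing an out-of-range iterator throws nothing. The helper name char_at is made up for this example.

```cpp
#include <cassert>
#include <stdexcept>
#include <string>

// Checked access by index. Stores the character in out and returns true when
// i is a valid index into s; returns false and leaves out unchanged otherwise.
bool char_at(const std::string& s, std::string::size_type i, char& out) {
    try {
        out = s.at(i);  // throws std::out_of_range when i >= s.size()
        return true;
    } catch (const std::out_of_range&) {
        return false;
    }
}
```

With iterators there is no equivalent of at(), so the comparison against the end iterator has to be done by hand before the dereference.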

>
> Thank you,
>
> David Mathog


Bart van Ingen Schenau
 
Marcel Müller
      05-02-2013
On 02.05.13 09.56, Andy Champ wrote:
> If I then append to the string, so the buffer now contains
>
> "ABCdefghijklmnop"
>
> Without any change whatsoever to the iterator it has now become valid -
> it points at d.


No, this is undefined behavior. Appending to a vector or string may
invalidate all existing iterators into that instance, because the
append operation could require a reallocation.
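If a position has to survive an append, one common workaround is to remember it as an offset and rebuild the iterator afterwards. The following is only a sketch for std::string, and append_keeping_position is a made-up name.

```cpp
#include <cassert>
#include <string>

// Appends suffix to s and returns an iterator to the same character position
// that pos referred to before the append. pos must be an iterator into s.
std::string::iterator append_keeping_position(std::string& s,
                                              std::string::iterator pos,
                                              const std::string& suffix) {
    // Record the position as an offset while pos is still valid.
    const std::string::difference_type offset = pos - s.begin();
    assert(offset >= 0 &&
           static_cast<std::string::size_type>(offset) <= s.size());

    s += suffix;  // may reallocate: pos and all other old iterators are now unusable

    // Rebuild the iterator from the offset against the (possibly new) buffer.
    return s.begin() + offset;
}
```

Applied to the "ABC" example with pos equal to the old end(), the returned iterator refers to the 'd' of the appended text, whereas the old iterator must not be used again.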

However, your answer that you can't check whether an iterator is valid
and dereferenceable is right. I think this has mainly been done for
performance reasons. In C++ iterators are intended to be very cheap to
copy. In many cases they are only a single machine word in size.

In other languages like Java iterators are heap objects. It doesn't
matter whether they are a few bytes larger or not.


Marcel
 
James Kanze
      05-02-2013
On Thursday, 2 May 2013 08:56:10 UTC+1, Andy Champ wrote:
> On 01/05/2013 21:57, mathog wrote:


> > Is there not in C++ something like:


> > if(icc->dereferencable()){


> No, there isn't, and for good reasons.


The good reason is probably because there's no way of
implementing it, given that you need a second iterator to know
whether you're at the end or not. Every iterator I wrote before
STL came along supported something like this (usually
icc.isValid()).
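A rough sketch of what such a pre-STL style iterator might look like, assuming it walks a std::string. CharCursor and its members are invented for this illustration and are not taken from any real library. The point is that it carries its own end, so it can answer isValid() by itself.

```cpp
#include <cassert>
#include <string>

// Hypothetical cursor that stores both its current position and the end of
// its range. The string it was created from must outlive the cursor and must
// not be modified while the cursor is in use.
class CharCursor {
public:
    explicit CharCursor(const std::string& text)
        : current_(text.begin()), end_(text.end()) {}

    // True while there is a character to read at the current position.
    bool isValid() const { return current_ != end_; }

    // Reads the current character; only legal while isValid() is true.
    char value() const {
        assert(isValid());
        return *current_;
    }

    // Moves to the next character; only legal while isValid() is true.
    void next() {
        assert(isValid());
        ++current_;
    }

private:
    std::string::const_iterator current_;
    std::string::const_iterator end_;
};
```

The price is that the cursor holds two iterators instead of one, and it still cannot notice that the underlying string was changed or destroyed behind its back.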

> Inside a std::string there's usually a buffer containing the string. (I
> don't think there _has_ to be, but that's another matter (1) ). That
> string is a load of characters, usually bytes.


> Imagine I have an internal vector, which for efficiency the string code
> has initially allocated as 16 bytes even though it only contains "ABC".
> I'll use ? as a marker for "undefined". The bytes are then


> "ABC?????????????"


> An iterator to C can be de-referenced, but if you increment it you get
> one that cannot be de-referenced. There's nothing about that ? that
> marks it as something that can't be accessed. Without
> accessing the original collection there's nothing the iterator
> can use either - and for reasons I don't know the original STL
> design doesn't contain references from iterators into the
> collection (2). And if you _do_ de-reference it you'll just
> get whatever character happens to be in the first question
> mark.


With most modern implementations, you'll get an assertion
failure when their checked-iterator (debug) mode is enabled, for
example in VC++ debug builds or with g++ and -D_GLIBCXX_DEBUG.

> If I then append to the string, so the buffer now contains


> "ABCdefghijklmnop"


> Without any change whatsoever to the iterator it has now become valid -
> it points at d.


What happens in this case is undefined behavior. I suspect,
however, that most implementations would miss that error
(supposing that capacity() had been larger than the new string).

> I can then set it to end(). Typically this will be an address one more
> than p. Again, without reference to the collection you can't tell if
> it's valid. And if you do de-reference it - well, you might get the byte
> that follows p. Or that page in the processor's memory space might not
> have been allocated, and you get an exception. So once more you are in
> the realms of undefined behaviour.


> (1) I just checked. In C++98 there's no requirement for there to be an
> internal buffer, but for C++11 there is!


> (2) But I can guess. Suppose the collection was on the heap, and was
> deleted? Suppose the iterator was re-pointed into a different
> collection? And if the collection was deleted, and another one of the
> same type created in the same heap location, what then?


In the checked (debug) modes of most implementations, iterators
register with the container, so that they can be marked as invalid
in such cases.

--
James
 