Re: Pointer Arithmetic & UB
On 12/10/2012 12:14 PM, Edward Rutherford wrote:
> Hello
>
> Would the following code invoke an undefined behavior?

"invoke" is a bad term to use for this purpose; it implies that there's some particular kind of behavior which is called "undefined behavior". You should ask "Would the following code have undefined behavior?"

> char a[10];
> size_t i=20,j=15;
> *(a+i-j)=42;
>
> It potentially constructs the invalid pointer a+i as an intermediate
> value. But overall the access is inbounds.

Yes, it does have undefined behavior.

To make this seem more reasonable, consider a platform with the following real-world characteristics: there are registers specialized for storing addresses, and when an invalid address is stored in one of those registers, the current process aborts immediately, as a safety measure - it doesn't wait for the invalid address to be used.

On such a platform, a conforming implementation could translate your code so that 'a' is allocated near the end of a block of valid memory addresses, so that adding 20 to 'a' gives an invalid address. It could generate instructions that load 'a' into an address register, then add 'i' to it. Execution of those instructions would leave the register containing an invalid address, causing your program to be aborted.
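A minimal sketch of the point above, assuming a helper named `store_at_difference` (a made-up name, not something from the thread). The expression `a+i-j` parses as `(a+i)-j`, so the intermediate `a+i` points 20 elements into a 10-element array, and the C standard only defines pointer arithmetic that stays within the array or lands one past its end. Doing the integer subtraction first keeps every pointer value in range.

```c
#include <assert.h>
#include <stddef.h>

/* Hypothetical helper, not from the thread: stores `value` at a[i - j]
 * without ever forming an out-of-range pointer. `n` is the array length. */
void store_at_difference(char *a, size_t n, size_t i, size_t j, char value)
{
    /* Integer arithmetic first: i - j is computed as a size_t. */
    assert(i >= j);          /* otherwise the unsigned subtraction wraps */
    size_t offset = i - j;
    assert(offset < n);      /* the element must exist */

    /* a + offset stays inside the array, so the arithmetic is defined.
     * Writing *(a + i - j) instead would first form a + i, which is
     * undefined when i > n, even though the final address is in bounds. */
    *(a + offset) = value;
}
```

With the values from the original question, `store_at_difference(a, 10, 20, 15, 42)` writes to `a[5]` and never computes `a + 20`.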
Re: Pointer Arithmetic & UB
On 12/10/2012 12:55 PM, James Kuyper wrote:
> On 12/10/2012 12:14 PM, Edward Rutherford wrote:
>> Hello
>>
>> Would the following code invoke an undefined behavior?
>
> "invoke" is a bad term to use for this purpose; it implies that there's
> some particular kind of behavior which is called "undefined behavior".
> You should ask "Would the following code have undefined behavior?"
>
>> char a[10];
>> size_t i=20,j=15;
>> *(a+i-j)=42;
>>
>> It potentially constructs the invalid pointer a+i as an intermediate
>> value. But overall the access is inbounds.
>
> Yes, it does have undefined behavior.
> To make this seem more reasonable, consider a platform with the
> following real-world characteristics: [...]

A colleague who did some work on IBM's AS/400 (they've changed the name; I forget the new one) told me that simply trying to calculate an out-of-range pointer yielded a null pointer as a result. In the O.P.'s case, the intermediate steps would go something like

    a            // OK so far
    a + i        // too big: result = NULL
    NULL - j     // not sure, but surely not good
    *(NULL - j)  // really Really REALLY not good

--
Eric Sosman
esosman@comcast-dot-net.invalid
Re: Pointer Arithmetic & UB
In article <ka59e6$mq7$1@dont-email.me>,
Eric Sosman <esosman@comcast-dot-net.invalid> wrote:
>
> A colleague who did some work on IBM's AS/400 (they've
> changed the name; I forget the new one) told me that simply
> trying to calculate an out-of-range pointer yielded a null
> pointer as a result.

Heh; learn something new every day. I never would have guessed that there was an actual architecture that would blow up with this construct.

I assume that *(a+(i-j)) would be ok?

--
-Ed Falk, falk@despams.r.us.com
http://thespamdiaries.blogspot.com/
Re: Pointer Arithmetic & UB
Context:
    char a[10];
    size_t i=20,j=15;
    *(a+i-j)=42;

On 12/10/2012 07:46 PM, Edward A. Falk wrote:
....
> Heh; learn something new every day. I never would have guessed
> that there was an actual architecture that would blow up with
> this construct.
>
> I assume that *(a+(i-j)) would be ok?

That should be safe for all conforming implementations of C.

--
James Kuyper
Re: Pointer Arithmetic & UB
Edward A. Falk wrote:
> I assume that *(a+(i-j)) would be ok?

Please correct me if I am wrong, but *(a+(i-j)) is strictly equivalent to a[i-j]. (I find the latter clearer.)
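For what it is worth, the C standard defines the subscript operator so that `E1[E2]` is identical to `(*((E1)+(E2)))`, which makes `a[i-j]` and `*(a+(i-j))` the same expression by definition. A small sketch, with helper names invented here for illustration, that compares the two spellings:

```c
#include <assert.h>
#include <stddef.h>

/* Hypothetical helpers, named here only for illustration. */

/* Address of the element selected with subscript notation. */
char *element_by_subscript(char *a, size_t i, size_t j)
{
    assert(i >= j);        /* keep the size_t subtraction from wrapping */
    return &a[i - j];
}

/* Address of the same element using explicit pointer arithmetic. */
char *element_by_pointer(char *a, size_t i, size_t j)
{
    assert(i >= j);
    return a + (i - j);
}
```

Both functions compute `i - j` as an integer before touching the pointer, so neither forms `a + i`.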
Re: Pointer Arithmetic & UB
On 12/10/2012 7:46 PM, Edward A. Falk wrote:
> In article <ka59e6$mq7$1@dont-email.me>,
> Eric Sosman <esosman@comcast-dot-net.invalid> wrote:
>>
>> A colleague who did some work on IBM's AS/400 (they've
>> changed the name; I forget the new one) told me that simply
>> trying to calculate an out-of-range pointer yielded a null
>> pointer as a result.
>
> Heh; learn something new every day. I never would have guessed
> that there was an actual architecture that would blow up with
> this construct.
>
> I assume that *(a+(i-j)) would be ok?

Assuming `i-j' in range, yes.

More on my colleague's tale: The code maintained a buffer in which items of various sizes accumulated, and which drained to disk when it got too full or too old. To decide whether a newly-offered item would fit, the code did something like

    itemEndPtr = nextBufferSpacePtr + itemSize;
    if (itemEndPtr < bufferStart + bufferSize) ...

This worked as intended on all the other target systems, but failed on AS/400. I suspect the failure had something to do with the fact that the buffer was in a shared memory area, so stepping off the end also meant stepping outside of mapped address space; the problem might not have shown up with the `auto' array in your example.

Still, perhaps a salutary lesson for the folks who still believe "All the world's a VAX^H^H^Hx86^H^H^Hx64^H^H^H..."

--
Eric Sosman
esosman@comcast-dot-net.invalid
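One common way to write a fit test like this without forming an out-of-range pointer is to compare byte counts instead of pointers. The sketch below is only an assumption about what a portable variant could look like, not the colleague's actual code; `item_fits` is a made-up name, and the parameter names are borrowed from the fragment in the post.

```c
#include <assert.h>
#include <stdbool.h>
#include <stddef.h>

/* Hypothetical fit test: compares byte counts rather than pointers, so
 * nextBufferSpacePtr + itemSize is never computed. */
bool item_fits(const char *bufferStart, size_t bufferSize,
               const char *nextBufferSpacePtr, size_t itemSize)
{
    /* Precondition: the cursor lies within the buffer or one past its end. */
    assert(nextBufferSpacePtr >= bufferStart);
    size_t used = (size_t)(nextBufferSpacePtr - bufferStart);
    assert(used <= bufferSize);

    /* Remaining space is a plain integer; no pointer leaves the buffer.
     * Note: <= accepts an item that exactly fills the buffer, whereas the
     * `<` in the quoted fragment would reject it. */
    return itemSize <= bufferSize - used;
}
```

For a 16-byte buffer with 10 bytes used, `item_fits` accepts a 6-byte item and rejects a 7-byte one, and the subtraction `bufferSize - used` cannot wrap because of the precondition.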
Re: Pointer Arithmetic & UB
Eric Sosman <esosman@comcast-dot-net.invalid> wrote:
(previous snip on pointer offsets)

>>> A colleague who did some work on IBM's AS/400 (they've
>>> changed the name; I forget the new one) told me that simply
>>> trying to calculate an out-of-range pointer yielded a null
>>> pointer as a result.

>> Heh; learn something new every day. I never would have guessed
>> that there was an actual architecture that would blow up with
>> this construct.

>> I assume that *(a+(i-j)) would be ok?

> Assuming `i-j' in range, yes.

> More on my colleague's tale: The code maintained a buffer
> in which items of various sizes accumulated, and which drained
> to disk when it got too full or too old. To decide whether a
> newly-offered item would fit, the code did something like

> itemEndPtr = nextBufferSpacePtr + itemSize;
> if (itemEndPtr < bufferStart + bufferSize) ...

Might fail on x86 (especially the 80286) in huge model.

You can't load arbitrary data into segment selector registers in protected mode x86. In large model, though, the offset isn't tested until an actual access is attempted. (The offset is in an ordinary register, such as AX.) In huge model, the system allocates a series of segments, such that one can address through them in order.

Still, I believe that the compilers are careful not to load a segment selector until needed to actually access something, maybe partly to allow such faulty C code.

> This worked as intended on all the other target systems, but
> failed on AS/400. I suspect the failure had something to do
> with the fact that the buffer was in a shared memory area, so
> stepping off the end also meant stepping outside of mapped
> address space; the problem might not have shown up with the
> `auto' array in your example.

I believe that could happen with protected mode x86, too.

> Still, perhaps a salutary lesson for the folks who still
> believe "All the world's a VAX^H^H^Hx86^H^H^Hx64^H^H^H..."

In the 80286 days, I had OS/2 1.0 and then 1.2 running, when just about everyone else was running MS-DOS.

Instead of using malloc(), I would directly allocate segments from OS/2 of exactly the needed length. The hardware will then interrupt for an access, even a read, either before or just after the end of the allocated space. (Unless the register wraps, and it is back into the allocated space again.) As usual in C, a 2D array was allocated as an array of pointers, each pointing to its own OS/2 allocated segment.

Fortunately, the C compilers were always good at not using segment selector registers when copying pointers that might not point to anything.

I don't know AS/400 that well, but there have been systems that relied on the compiler to generate the appropriate code, instead of run-time memory protection. I believe some Burroughs ALGOL systems worked that way. (Maybe still do.) As far as I know, they never had a C compiler, but if one did it might also have problems with out of range pointers.

-- glen
Re: Pointer Arithmetic & UB
On 12/10/2012 7:46 PM, Edward A. Falk wrote:
> In article <ka59e6$mq7$1@dont-email.me>,
> Eric Sosman <esosman@comcast-dot-net.invalid> wrote:
>>
>> A colleague who did some work on IBM's AS/400 (they've
>> changed the name; I forget the new one) told me that simply
>> trying to calculate an out-of-range pointer yielded a null
>> pointer as a result.
>
> Heh; learn something new every day. I never would have guessed
> that there was an actual architecture that would blow up with
> this construct.
>
> I assume that *(a+(i-j)) would be ok?

No. There is no requirement that the value of "i-j" be calculated prior to adding it to "a". (Check the numerous threads here involving using parentheses to "fix" UB in things involving such constructs as "i + (i++)".)

Operator precedence only guarantees how the expression is to be interpreted, not the actual order of evaluation.
Re: Pointer Arithmetic & UB
On 12/10/2012 9:28 PM, James Kuyper wrote:
> Context:
> char a[10];
> size_t i=20,j=15;
> *(a+i-j)=42;
>
> On 12/10/2012 07:46 PM, Edward A. Falk wrote:
> ...
>> Heh; learn something new every day. I never would have guessed
>> that there was an actual architecture that would blow up with
>> this construct.
>>
>> I assume that *(a+(i-j)) would be ok?
>
> That should be safe for all conforming implementations of C.

Are you sure? Does anything in the Standard *require* that "i-j" be evaluated prior to adding it to "a"?

Haven't we had this discussion earlier, related to other forms of UB, with the questioner asking if adding parentheses would "fix" the problem?
Re: Pointer Arithmetic & UB
Ken Brody <kenbrody@spamcop.net> writes:
> On 12/10/2012 7:46 PM, Edward A. Falk wrote:
>> In article <ka59e6$mq7$1@dont-email.me>,
>> Eric Sosman <esosman@comcast-dot-net.invalid> wrote:
>>>
>>> A colleague who did some work on IBM's AS/400 (they've
>>> changed the name; I forget the new one) told me that simply
>>> trying to calculate an out-of-range pointer yielded a null
>>> pointer as a result.
>>
>> Heh; learn something new every day. I never would have guessed
>> that there was an actual architecture that would blow up with
>> this construct.
>>
>> I assume that *(a+(i-j)) would be ok?
>
> No. There is no requirement that the value of "i-j" be calculated prior to
> adding it to "a". (Check the numerous threads here involving using
> parentheses to "fix" UB in things involving such constructs as "i + (i++)".)
> Operator precedence only guarantees how the expression is to be
> interpreted, not the actual order of evaluation.

True, but the expression `a+(i-j)` is evaluated *in the abstract machine* by subtracting j from i and then adding the result to a. A compiler is free to evaluate it by computing a+i and then subtracting j from the result *only* if it can guarantee that the result is the same, or if the canonical order has undefined behavior.

`INT_MAX + (1 - 1)` has well defined behavior. `INT_MAX + 1 - 1` does not.

--
Keith Thompson (The_Other_Keith) kst-u@mib.org <http://www.ghoti.net/~kst>
Will write code for food.
"We must do something. This is something. Therefore, we must do this."
    -- Antony Jay and Jonathan Lynn, "Yes Minister"
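The integer example can be sketched without actually executing the overflowing addition. In `INT_MAX + (1 - 1)` the abstract machine computes `1 - 1` first and then adds 0, while `INT_MAX + 1 - 1` parses as `(INT_MAX + 1) - 1`, whose first addition overflows. The helper names below (`add_overflows`, `add_grouped`) are invented for illustration, and the sketch assumes that `i - j` itself does not overflow.

```c
#include <assert.h>
#include <limits.h>
#include <stdbool.h>

/* Hypothetical helper: reports whether x + y would overflow int,
 * without performing the addition. */
bool add_overflows(int x, int y)
{
    return (y > 0 && x > INT_MAX - y) || (y < 0 && x < INT_MIN - y);
}

/* Hypothetical helper: computes base + (i - j) in the grouped order.
 * Assumes i - j itself is representable as an int. */
int add_grouped(int base, int i, int j)
{
    int difference = i - j;                  /* evaluated first */
    assert(!add_overflows(base, difference));
    return base + difference;
}
```

Here `add_grouped(INT_MAX, 1, 1)` adds 0 to `INT_MAX`, while `add_overflows(INT_MAX, 1)` reports that the first step of the ungrouped form would overflow.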