> To get the address of a particular item, this layout naturally leads to the formula "base address + index * element size", with "index" being 0-based. If you want to expose other indexing schemes in your language, you'll have to add more logic to convert the user-visible index back to 0-based before you can obtain the address.
The easy way to do that is by shifting the base address. In pseudo-C (I think that computing the b pointer is non-conforming, even if the code never tries to access b[0], but compilers can do this without problems):
int a[100]; // array of 100 integers, zero-based
int * b = a - 1; // array of 100 integers, one-based
Things only get costly when you want to check array bounds or when you have multi-dimensional arrays. There also may be non-standard architectures where this kind of stuff isn’t possible.
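To make the bounds-check and multi-dimensional cost concrete, here is a rough sketch in C. The struct array2d descriptor and the in_bounds/at helpers are names made up for illustration, and row-major storage is an assumption. The shift is applied to the integer offset rather than to the pointer, so no out-of-range pointer is ever formed.

```c
#include <assert.h>
#include <stddef.h>

/* Hypothetical descriptor for a row-major 2-D array with arbitrary lower bounds. */
struct array2d {
    int *data;            /* 0-based storage holding rows * cols elements */
    ptrdiff_t lo_i, lo_j; /* lowest valid index in each dimension */
    ptrdiff_t rows, cols; /* extent of each dimension */
};

/* Bounds check: each dimension is compared against both its lower and upper bound. */
static int in_bounds(const struct array2d *a, ptrdiff_t i, ptrdiff_t j)
{
    return i >= a->lo_i && i < a->lo_i + a->rows
        && j >= a->lo_j && j < a->lo_j + a->cols;
}

/* Address of element (i, j), where i and j use the array's own lower bounds. */
static int *at(const struct array2d *a, ptrdiff_t i, ptrdiff_t j)
{
    assert(in_bounds(a, i, j));
    /* The per-dimension shifts fold into a single constant. */
    ptrdiff_t shift = a->lo_i * a->cols + a->lo_j;
    return a->data + (i * a->cols + j - shift);
}
```

For a one-based 2x3 array, at(&a, 1, 1) gives the first element of the storage and at(&a, 2, 3) gives the last. The shift could be computed once when the descriptor is created, which is the integer-offset version of "shifting the base address".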
In this case, b is a pointer that does not point to a valid address. You can't memset(b, 0, 100), or do any of the other regular pointer things with b.
It feels like you're introducing a million edge cases.
This is undefined behaviour: a pointer may only point to an element of the array or to one past its last element (and that one-past-the-end pointer can't be dereferenced).
I think, in theory, it would work regardless of the starting address, as long as you never access the invalid address. You wouldn't, since indexing starts at 1, so the lowest address you ever access is the first valid one.
“In other words, if the expression P points to the i-th element of an array object, the expressions (P)+N (equivalently, N+(P)) and (P)-N (where N has the value n) point to, respectively, the i + n-th and i − n-th elements of the array object, _provided_they_exist”
[…]
If both the pointer operand and the result point to elements of the same array object, or one past the last element of the array object, the evaluation shall not produce an overflow; _otherwise,_the_behavior_is_undefined.
In this case, the minus-one-th element doesn’t exist, so the expression
int * b = a - 1;
triggers undefined behavior.
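One way to get one-based indexing without forming the a - 1 pointer is to allocate one extra element and leave slot 0 unused. A minimal sketch, assuming the wasted element is acceptable; storage, one_based_set and one_based_get are illustrative names, not anything from the comments above:

```c
#include <assert.h>
#include <stddef.h>

#define N 100

/* One extra element: slot 0 is unused padding, slots 1..N hold the data. */
static int storage[N + 1];

/* b points at a real element, so b[1]..b[N] are ordinary in-bounds accesses. */
static int *const b = storage;

/* Store value at one-based index i (valid range 1..N). */
static void one_based_set(ptrdiff_t i, int value)
{
    assert(i >= 1 && i <= N);
    b[i] = value;
}

/* Load the element at one-based index i (valid range 1..N). */
static int one_based_get(ptrdiff_t i)
{
    assert(i >= 1 && i <= N);
    return b[i];
}
```

Here b is a valid pointer, so memset and the other regular pointer operations work on it, at the cost of one element that is never used.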
I think some compilers use this in practice to produce faster code (that, often, will not do what the programmer expects it to do). Start reading at https://stackoverflow.com/questions/56360316/c-standard-rega... if you’re sure they don’t. I expect that will change your opinion.
That is a description of the commonly agreed upon definition of the C abstract machine and language semantics. You could simply define the language another way with regards to this behaviour.
Not that I would want to do it. I think zero-based addressing is not very taxing, and it has the convenience of being closer to how we think of memory addressing.
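For what it's worth, a language that wants another lower bound can stay within the "base address + index * element size" formula by converting the index before doing the pointer arithmetic. A small sketch in C; element_address and get_one_based are hypothetical helper names, and the assert is only a debug-time sanity check:

```c
#include <assert.h>
#include <stddef.h>

/* Hypothetical helper: address of element `index` in an array whose
   user-visible indices start at `lower` (0 for C-style, 1 for one-based). */
static int *element_address(int *base, ptrdiff_t lower, ptrdiff_t index)
{
    assert(index >= lower);
    /* Convert to a 0-based offset first; the pointer arithmetic then
       scales it by the element size, as in the formula. */
    return base + (index - lower);
}

/* One-based read: index 1 is the first element. */
static int get_one_based(int *base, ptrdiff_t index)
{
    return *element_address(base, 1, index);
}
```

With int a[3] = {10, 20, 30}, get_one_based(a, 1) reads a[0] and element_address(a, 1, 3) is &a[2]. The extra subtraction is the "more logic" the first comment talks about.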