Pointer Math in C

C is a very confusing language.

The world is built on C, and some people are able to get very good at it, but I am not one of those people.

I pick it up for class, learn it, use it, appreciate it, and then forget it. Why would I want to write an application in C? I’m not a systems programmer, so I don’t touch it.

One side effect of this usage pattern is that I quickly forget how pointers and manual memory management work.

Pointers are the semantics of calloc and free are easy enough to refresh on. It takes me a little while to remember if I need to use *, &, ->, or . when working with pointers. That’s not a big deal though — again, I can refresh myself on the syntax rather easily.

One thing that does trip me up (and led to a very annoying bug in a barrier algorithm) is 2D arrays, or rather representing a 2D array as a pointer.

Here’s what I did.

int x;
int y;
int *array;

array = calloc(x * y, sizeof(int))

for (int i = 0; i < x; i++) {
    for (int j = 0; j < y; j++>) {
        item = array[i + j];
    }
}

Now, this seems somewhat reasonable at first. The problem is that there is going to be a collision. x = 0, y = 1 and x = 1, y = 0 will refer to the same slots in the array, which shouldn’t happen!

My next attempted was to change the array access to array[i * j]. This also doesn’t work. Consider when i = 0 or j = 0. Any multiplication by zero is zero, so these will all refer to the same slot.

The correct solution is rather simple. The access should be array[(i * y) + j]. Let’s prove this with an example.

With x = 2 and y = 3. These are the possible values of i and j:

i = 0, j = 0
i = 0, j = 1
i = 0, j = 2
i = 1, j = 0
i = 1, j = 1
i = 1, j = 2

Now, lets see which slot in the array each pair will fit into.

i = 0, j = 0; (0 * 3) + 0 = 0
i = 0, j = 1; (0 * 3) + 1 = 1
i = 0, j = 2; (0 * 3) + 2 = 2
i = 1, j = 0; (1 * 3) + 0 = 3
i = 1, j = 1; (1 * 3) + 1 = 4
i = 1, j = 2; (1 * 3) + 2 = 5

A unique index for each item! This is exactly what we wanted.

You should perform these accesses based on the usage patterns of your data. The example above uses the x value as the column and the y value as the row. Accesses to sequential x values will be faster than accesses to sequential y values because of locality.

Recent posts from blogs that I like

From the Commedia dell’Arte to Punch and Judy 1

Pierrot, Harlequin and other characters from the early professional theatre seen in paintings by Watteau, Goya and others.

via The Eclectic Light Company

Bloom filters

The original motivation for the creation of Bloom filters is efficient set membership, using a probabilistic approach to significantly reduce the time and space required to reject items that are not members in a certain set. The data structure was proposed by Burton Bloom in a 1970 paper titled "Spa...

via Eli Bendersky

Two publishers and three authors fail to understand what "vibe coding" means

via Simon Willison