Lecture – C Pointer Variables

Ryan Robucci

Opening Discussion on Use of Pointers

Consider a scenario in main where some data has been captured, but it must be converted from signed 6-bit to signed 8-bit.

Capture loop (C-style code):

byte dataHigh [500]; 
byte dataLow [500]; 
... 
for(i=0;i<500;i++) {
    dataLow [i]=PINA;
    dataHigh [i]=PINB;
    delay_ms(1);
}

500 data points stored in memory
want to send to a function for conversion
what are the mechanisms for passing the values and collecting the results?
- Registers?
  - too many values
- Stack?
  1. the caller pushes 500 values onto the stack
  2. the function processes them and then pushes all 500 values back onto the stack
  3. the caller then pops all 500 values from the stack
  This is not a good approach: it represents a thousand or more unnecessary moves of data to and from the stack.
- What if at the time of the call, we could just point the function to the place in memory where the data is stored and have it access and manipulate it directly? We can do this with variables called pointer variables in C for storing memory addresses and referencing memory locations.
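
As a minimal sketch of this idea (the function name `SignExtend6To8` and the choice of `uint8_t` storage are assumptions for illustration, not part of the original capture code), the caller hands over only the array's address and its length, and the function converts the samples in place:

```c
#include <stddef.h>
#include <stdint.h>

/* Convert signed 6-bit samples (held in the low 6 bits of each byte)
   to signed 8-bit two's complement, in place.
   Only the address and the element count are passed, not the data. */
void SignExtend6To8(uint8_t *samples, size_t count)
{
    for (size_t i = 0; i < count; i++) {
        uint8_t v = samples[i] & 0x3F;   /* keep the 6 data bits */
        if (v & 0x20) {                  /* bit 5 is the 6-bit sign bit */
            v |= 0xC0;                   /* copy the sign into bits 6 and 7 */
        }
        samples[i] = v;
    }
}
```

A caller holding `uint8_t data[500];` would write `SignExtend6To8(data, 500);`, and no sample is copied onto the stack.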

Use of pointers:

Rather than copying all the data, the location (address) of the data can be passed to the function, and the function can access the data directly.
How to point to 500 data points using one variable?

One way is to store the 500 data points in a contiguous block of memory. Then only a few pieces of information are needed.

Information required to use a contiguous data blob representing an array:

- the address of the first element
- the number of bytes (or bits) to advance in memory to find the next element
  - and knowledge of how to advance by this increment on the given architecture/platform
- an agreed-upon interpretation of the bits of each element

Arrays are set up this way in C. In C, a simple data array is stored in a contiguous block of memory (so far as the program can tell, since any memory paging is abstracted).

The location of the array is tracked using the address of the first element; in most expressions the array name itself evaluates to that address.
The type of the elements (int, char, float, etc.) defines the number of bytes to move forward to find subsequent elements, as well as the interpretation of their bits.

So, if you know data[0] exists at memory location 0x0F10, where is the second point?

You might be tempted to say that if data[0] exists at memory address 0x0F10, then the next data point is at 0x0F11.
But you need to know
1. The number of bytes used for each element in the array
2. How many bytes forward in memory we are looking when incrementing the address by one
  An architecture might define an increment in an address as a word increment, where a word might be 2, 4, or more bytes
Assuming byte-addressable memory, where changing the address by one moves to the next byte in memory, I still need to know the size of each piece of data.
1. If it is one byte each, I’ll look for data[1] at 0x0F11.
2. If each data point is stored using 2 bytes, I’ll look at 0x0F12.
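
The same reasoning can be checked in C. The sketch below (the helper names are illustrative assumptions) measures the distance in bytes between element 0 and element 1 for two element sizes, assuming a byte-addressable platform that provides `uint8_t` and `uint16_t`:

```c
#include <stddef.h>
#include <stdint.h>

/* Distance in bytes from data[0] to data[1] for 1-byte elements. */
size_t StrideOfBytes(void)
{
    uint8_t data[2] = {0, 0};
    return (size_t)((unsigned char *)&data[1] - (unsigned char *)&data[0]);
}

/* Distance in bytes from data[0] to data[1] for 2-byte elements. */
size_t StrideOfWords(void)
{
    uint16_t data[2] = {0, 0};
    return (size_t)((unsigned char *)&data[1] - (unsigned char *)&data[0]);
}
```

With `data[0]` at 0x0F10, the first case corresponds to finding `data[1]` at 0x0F11 and the second to finding it at 0x0F12.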

So, how is the size of the base data type known by the function?

I want you to think about writing assembly here. There are several options.

One option is that the array is passed to a function as two pieces of information at RUN TIME:

starting address (addr)

size in bytes per data point (size)
At the level of assembly pseudo-code, assuming parameters are passed on the stack…

        Pop size  from stack to a register Rs 
        Pop address from stack to register pair Z 
Loop:   LD Rd, Z 
        Do some processing 
        ST Z, Rd 
        ADD Rs to Z 
        Evaluate exit condition and potentially loop

Typically we are in a better situation, where the data type, and therefore the element size, is determined at COMPILE TIME.
Dynamic Element Size:

        Pop size from stack to Rs 
        Pop addr from stack to register pair Z 
Loop:   LD Rd, Z 
        Do processing using variable wordsize Rs 
        ST Z, Rd 
        Add Rs to Z 
        Evaluate exit condition and potentially loop

Compile-Time Element Size:

        Pop addr from stack to register pair Z 
Loop:   LD Rd, Z 
        Do processing using hard-coded,compile-time wordsize SIZEOFTYPE 
        ST Z, Rd 
        ADIW Z, SIZEOFTYPE  
        Evaluate exit condition and potentially loop

So if you know where the first point of data is in an array, you know where the second point is (address of first point + size of data type), and so on for the rest of the array.
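
In C terms, the run-time-size and compile-time-size loops above correspond roughly to the following pair of functions. This is a sketch under assumptions: the function names and the placeholder processing step (inverting the bits) are invented for illustration.

```c
#include <stddef.h>
#include <stdint.h>

/* Run-time element size: the function receives the size as data and
   must step through memory one byte-sized unit at a time. */
void InvertRunTimeSize(void *base, size_t size, size_t count)
{
    unsigned char *p = base;             /* byte pointer into the block */
    for (size_t i = 0; i < count; i++) {
        for (size_t b = 0; b < size; b++) {
            p[b] = (unsigned char)~p[b]; /* process one byte of the element */
        }
        p += size;                       /* advance by the run-time size */
    }
}

/* Compile-time element size: the pointer type fixes the step, so
   p++ advances by sizeof(uint16_t) bytes. */
void InvertCompileTimeSize(uint16_t *p, size_t count)
{
    for (size_t i = 0; i < count; i++) {
        *p = (uint16_t)~*p;              /* process the whole element */
        p++;                             /* step chosen by the compiler */
    }
}
```

In the second version the element size never appears as a parameter; the compiler derives it from the pointer's type.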

What about the length of the array?

Length of Array

Remember, the end of the array must always be encoded via some mechanism or convention:

another variable (e.g. storing length)
a termination value in the array
a predefined array length
etc…
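
Two of these conventions can be sketched in C as follows. The function names and the terminator value -1 are assumptions chosen for illustration.

```c
#include <stddef.h>

/* Convention 1: the length travels in a separate variable. */
long SumWithLength(const int *data, size_t length)
{
    long sum = 0;
    for (size_t i = 0; i < length; i++) {
        sum += data[i];
    }
    return sum;
}

/* Convention 2: a termination value marks the end of the data.
   Here -1 is the (assumed) terminator and is not part of the sum. */
long SumUntilTerminator(const int *data)
{
    long sum = 0;
    while (*data != -1) {
        sum += *data;
        data++;
    }
    return sum;
}
```

C strings use the second convention, with the character '\0' as the terminator.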

Other reasons to use pointers:

So, we’ve discussed one reason to use pointers: sharing large sets of data with functions without unnecessary copying
(we discussed arrays; later we will discuss other large data structures like structs, lists, trees, etc.).

Later we will use pointers to keep track of and refer to dynamically allocated (run-time) arrays and structures, as opposed to arrays and variables that are defined at compile time and can have their location known at compile time.

Syntax of pointers in C

Now we need some syntax in C to provide a way to code all of this. Let’s look at some function prototypes and what parameters are pushed on the stack (assuming parameters are passed on the stack).

int myFunction1(int data);    /* a single data value is pushed */
int myFunction2(int data[]);  /* a single address is pushed */

The alternative syntax for the second is

int myFunction2(int * data);  /* a single address is pushed */

The * indicates that an address is being passed and that data is a pointer variable.
- The preceding type provided (int) tells the compiler how to treat the data accessed using that pointer variable and how to index into an array using that variable (I will refer to this as the pointee type)
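
The difference between the prototypes can be seen in a small sketch (the function names `SetValue` and `SetFirst` are hypothetical): a function that receives a copy of a value cannot change the caller's variable, while a function that receives an address can.

```c
/* Receives a copy of the value; the caller's variable is untouched. */
void SetValue(int data)
{
    data = 99;      /* changes only the local copy */
    (void)data;     /* keeps compilers from warning that data is set but unused */
}

/* Receives the address of the first element; int data[] and
   int *data mean the same thing in a parameter list. */
void SetFirst(int data[])
{
    data[0] = 99;   /* writes into the caller's array */
}
```

After `SetValue(x)` the caller's `x` is unchanged, whereas after `SetFirst(arr)` the caller's `arr[0]` holds 99.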

New Types for Variable: Pointer

So, we have a new variable type (int *).

Example:

int count;   
int * ptrCount;

We can “point to”/reference/“store address of” count like this:
ampersand prefix means return a pointer to, or “address of”

ptrCount  = & count;

We can use ptrCount as a regular int variable through dereferencing.
* is the dereference operator

(*ptrCount) = (*ptrCount) + 1; // modifies count
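
Put together as a small function (a sketch; the function name `CountDemo` and the starting value are assumptions), the declaration, address-of, and dereference steps look like this:

```c
/* Returns the final value of count after modifying it through a pointer. */
int CountDemo(void)
{
    int count = 5;
    int *ptrCount;

    ptrCount = &count;               /* ptrCount holds the address of count */
    (*ptrCount) = (*ptrCount) + 1;   /* count is now 6 */
    *ptrCount += 4;                  /* count is now 10 */

    return count;
}
```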

Pointer variables can be used to modify other variables

A pointer variable can be used to modify other variables. It can also be passed to functions to do the same.

Here is an example:

/* Sets both pointees to the larger of the two values */
void SameMax( int * ptrA, int * ptrB){ 
  if (*ptrA  >  *ptrB){ 
    *ptrB = *ptrA; 
  } else {  
    *ptrA = *ptrB; 
  }
}

…somewhere in main…

int a = 1, b=2; 
int * ptrInt; 

ptrInt = & a;

Now we can pass the pointer

SameMax(ptrInt, &b);

SameMax(&a, &b);

Both result in a=2, b=2. (&a creates a pointer)

What is a pointer?

In a generic sense, a “pointer” tells us where something can be found.
- Internet analogy
  - a fully qualified domain name is like the name of the pointer variable
  - the IP address tells us how to find it
In programming, a pointer variable contains the memory address of
- A basic, built-in typed variable
- An array
- Struct/Union (covered later)
- Dynamically allocated memory (covered in later lectures)

Why pointers?

They allow writing generic code that performs in-place operations on data without finalizing which data (the location) until run time
They allow you to refer to large data structures in a compact way
They facilitate sharing between different parts of programs
They make it possible to get new memory dynamically as your program is running
They make it easy to represent relationships among data items.
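
The first point, deferring the choice of data until run time, can be sketched as follows. The names `ClampToZero` and `ClampSelected` are hypothetical.

```c
/* In-place operation written once, for whatever int it is pointed at. */
void ClampToZero(int *p)
{
    if (*p < 0) {
        *p = 0;
    }
}

/* Which variable gets modified is decided only when the code runs. */
void ClampSelected(int *first, int *second, int useFirst)
{
    int *target = useFirst ? first : second;  /* location chosen at run time */
    ClampToZero(target);
}
```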

Caution with Pointers

They are a powerful low-level feature
Undisciplined use can be confusing and thus the source of subtle, hard-to-find bugs.
- Program crashes
- Memory leaks
- Unpredictable results

Java Reference vs. C Pointers (reference slide, not covered)

In Java, a reference variable is used to give a name to an object. The reference variable contains the memory address at which the object is stored.

Truck ford = new Truck( );

Since C does not support objects, it does not provide reference variables. However, C provides pointer variables which contain the memory address of another variable (sometimes called the “pointee”). A pointer may point to any kind of variable.

C Pointer Variables

To declare a pointer variable, we must do two things

Use the “*” (star) character to indicate that the variable being defined is a pointer type.
Indicate the type of variable to which the pointer will point (the pointee). This is necessary because C provides operations on pointers (e.g., *, ++, etc) whose meaning depends on the type of the pointee.
General syntax for declaration of a pointer:
type *nameOfPointer;

Pointer Declaration

Example: int * ptrInt;
- declares the variable ptrInt to be a pointer to a variable of type int. ptrInt will contain the memory address of some int variable or int array.
- Read this declaration as
  - “ptrInt is a pointer to an int”
  - Also, informal statement: “star ptrInt is an int”
Caution – Be careful when defining multiple variables on the same line.
- In this declaration
  int *ptrInt, ptrInt2;
  ptrInt is a pointer to an int, but ptrInt2 is NOT A POINTER!
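
A short sketch of the safe alternatives (the variable and function names are illustrative): repeat the star for each pointer, or declare one variable per line.

```c
/* Returns a sum built from the pointees to show which names are pointers. */
int DeclarationDemo(void)
{
    int a = 3, b = 4;

    int *ptrA = &a, *ptrB = &b;  /* star repeated: both are pointers */

    int *ptrC;                   /* one declaration per line avoids the trap */
    int plainInt;                /* clearly an int, not a pointer */

    ptrC = &a;
    plainInt = *ptrC;            /* plainInt is 3 */

    return *ptrA + *ptrB + plainInt;  /* 3 + 4 + 3 */
}
```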

Pointer Operators

The two primary operators used with pointers are
- * (star)
  and
- & (ampersand)
The * operator is used to declare pointer variables and to dereference a pointer. “Dereferencing” a pointer means to use the value of the pointee.
The & operator gives the address of a variable.
- Recall the use of & in scanf( ), it was required to pass the location of the variable to the function so that it could be modified in-place. Simply passing a copy of the value would not facilitate this.

Pointer Examples

int x = 1, y = 2, z[10];

int *ip;        /* ip is a pointer to an int */

ip = &x;        /* ip points to (stores the memory address of) x */

y = *ip;        /* y is now 1, indirectly copied from x using ip */

*ip = 0;        /* x is now 0 */ 

ip = &z[5];    /* ip now points to z[5] */

When in doubt about order of operations, or to increase readability and convey intent, use parentheses:

ip = &(z[5]);

If ip points to x, then *ip can be used anywhere x can be used so for the above example, the following are equivalent:

*ip = *ip + 10;
and
x = x + 10;

The unary operators (one operand) * and & are higher precedence than binary arithmetic operators (two operands). So y = *ip + 1; takes the value of the variable to which ip points, adds 1 and assigns it to y

The statements *ip += 1; and ++*ip; and (*ip)++; each increment the variable to which ip points.

(Note that the parentheses are necessary in the last statement; without them, the expression would increment ip rather than what it points to, since the unary operators * and ++ associate from right to left.)
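
The difference between `(*ip)++` and `*ip++` can be seen in a sketch like the following (the function name `IncrementDemo` and the array contents are assumptions):

```c
int IncrementDemo(void)
{
    int z[2] = {10, 20};
    int *ip = z;        /* ip points to z[0] */
    int v;

    (*ip)++;            /* increments the pointee: z[0] is now 11 */
    v = *ip++;          /* reads z[0] (11), then advances ip to z[1] */

    return v + *ip;     /* 11 + 20 */
}
```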

Pointer and Variable types (Objective Type)

I will now introduce terminology for this class that refers to the type being pointed to or the element type of an array. I will call this the “objective type”, since “pointee type” is difficult to distinguish grammatically from “type of pointee”.

The objective type of a pointer and the type of its pointee must match in assignments.

int a = 42; 
int *ip; 
double d = 6.34; 
double *dp; 
ip = &a;    /* ok -- types match */ 
dp = &d;    /* ok */ 
ip = &d;    /* type mismatch -- compiler diagnostic (warning or error) */ 
dp = &a;    /* type mismatch -- compiler diagnostic (warning or error) */
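
When a value of one type is needed in a variable of another type, the usual approach is to convert the value rather than the pointer. A minimal sketch (the function name `ConvertThroughPointers` is hypothetical):

```c
double ConvertThroughPointers(void)
{
    int a = 42;
    double d = 6.34;
    int *ip = &a;       /* int * points to an int */
    double *dp = &d;    /* double * points to a double */

    *dp = *ip;          /* the int value 42 is converted to double 42.0 */

    return d;
}
```

Each pointer still points to a variable of its own objective type; only the value crosses from int to double.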

More Pointer Code

Use ampersand ( & ) to obtain the address of the pointee
Use star ( * ) to get / change the value of the pointee
Use %p to print the value of a pointer with printf( )

What is the output from this code?

int a = 1, *ptr1;

/* show value and address of a
** and value of the pointer */
ptr1 = &a;
/* %p expects a void *, so the pointers are cast */
printf("a = %d, &a = %p, ptr1 = %p, *ptr1 = %d\n", a, (void *)&a, (void *)ptr1, *ptr1);

/* change the value of a by dereferencing ptr1
** then print again */
*ptr1 = 35;
printf("a = %d, &a = %p, ptr1 = %p, *ptr1 = %d\n", a, (void *)&a, (void *)ptr1, *ptr1);

NULL

NULL is a special value which may be assigned to a pointer
NULL indicates that this pointer does not point to any variable (there is no pointee)
Often used when pointers are declared
```
int *pInt = NULL; 
```

Often used as the return value of functions that return a pointer, to indicate function failure

int *myPtr = myFunction( );
if (myPtr == NULL) {
    /* something bad happened */
}

A NULL pointer is a pointer with the NULL value.
- Dereferencing a pointer whose value is NULL is undefined behavior; on Unix-like systems it typically terminates the program (segmentation fault)
- Typically used to indicate that the pointer doesn’t point to anything now
  - As in
    - Not initialized yet, not ready for dereferencing
    - Record was deleted and pointer doesn’t point to anything anymore
    - Query returned no result, no record to point to
    - End of array, e.g. NULL-terminated pointer array (like the null character in a string)
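
A sketch of the “query returned no result” case (the function name `FindFirstNegative` is an assumption):

```c
#include <stddef.h>

/* Returns a pointer to the first negative element, or NULL if none exists. */
int *FindFirstNegative(int *data, size_t length)
{
    for (size_t i = 0; i < length; i++) {
        if (data[i] < 0) {
            return &data[i];
        }
    }
    return NULL;    /* no pointee: the caller must check before dereferencing */
}
```

A caller would test the result, for example `if (p != NULL) { *p = 0; }`, before dereferencing it.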

Pointers and Function Arguments

Since C passes all function arguments “by value” there is no direct way for a function to alter a variable in the calling code.

Incorrect Swap

This version of the swap function doesn’t work. WHY NOT?

/* calling swap from somewhere in main() */
int x = 42, y = 17;
Swap( x, y );

/* wrong version of swap */
void Swap (int a, int b)
{
    int temp;
    temp = a;
    a = b;
    b = temp;
}

A corrected swap( )

The desired effect can be obtained by passing pointers to the values to be exchanged.

This is a very common use of pointers.

/* calling swap from somewhere in main( ) */ 
int x = 42, y = 17; 
Swap( &x, &y ); 
/* correct version of swap */ 
void Swap (int *px, int *py) 
{ 
    int temp; 
    temp = *px; 
    *px = *py; 
    *py = temp; 
}

More Pointer Function Parameters

Passing the address of variable(s) to a function can be used to have a function “return” multiple values.
The pointer arguments point to variables in the calling code which are changed (“returned”) by the function.

ConvertTime.c

#include <stdio.h>

void ConvertTime (int time, int *pHours, int *pMins)  
{ 
  *pHours = time / 60; 
  *pMins = time % 60; 
}  
int main( ) 
{ 
  int time, hours, minutes; 
  printf("Enter a time duration in minutes: "); 
  scanf ("%d", &time); 
  ConvertTime (time, &hours, &minutes); 
  printf("HH:MM format: %d:%02d\n", hours, minutes); 
  return 0; 
}
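
The same pattern extends to arrays. As a further sketch (the function `MinMax` is hypothetical and assumes `length` is at least 1), a function can report both the smallest and largest element through two pointer parameters:

```c
#include <stddef.h>

/* "Returns" two results through pointer parameters.
   Assumes length >= 1. */
void MinMax(const int *data, size_t length, int *pMin, int *pMax)
{
    *pMin = data[0];
    *pMax = data[0];
    for (size_t i = 1; i < length; i++) {
        if (data[i] < *pMin) {
            *pMin = data[i];
        }
        if (data[i] > *pMax) {
            *pMax = data[i];
        }
    }
}
```

The array itself is passed as a pointer to const because the function only reads it; the two output parameters are the ones it writes through.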

An Exercise