C TUTORIAL CODE: LINKED LIST THROUGH C LANGUAGE

INTRODUCTION Linked list is a chain of structs or records called nodes. Each node has at least two members, one of which points to the next item or node in the list! These are defined as Single Linked Lists because they only point to the next item, and not the previous. Those that do point to both are called Doubly Linked Lists or Circular Linked Lists.

Linked lists are among the simplest and most common data structures, and are used to implement many important abstract datastructures, such as stacks, queues, hash tables, symbolic expressions, skip lists, and many more.

The principal benefit of a linked list over a conventional array is that the order of the linked items may be different from the order that the data items are stored in memory or on disk. For that reason, linked lists allow insertion and removal of nodes at any point in the list, with a constant number of operations.

Linked List Types: Node and Pointer

Before writing the code to build the above list, we need two data types...

• Node The type for the nodes which will make up the body of the list.

These are allocated in the heap. Each node contains a single client data

element and a pointer to the next node in the list. Type: struct node

struct node {

int data;

struct node* next;

};

• Node Pointer The type for pointers to nodes. This will be the type of the

head pointer and the .next fields inside each node. In C and C++, no

separate type declaration is required since the pointer type is just the node

type followed by a '*'. Type: struct node*

BuildOneTwoThree() Function

Here is simple function which uses pointer operations to build the list {1, 2, 3}. The

memory drawing above corresponds to the state of memory at the end of this function.

This function demonstrates how calls to malloc() and pointer assignments (=) work to

build a pointer structure in the heap.

Build the list {1, 2, 3} in the heap and store

its head pointer in a local stack variable.

Returns the head pointer to the caller.

struct node* BuildOneTwoThree() {

struct node* head = NULL;

struct node* second = NULL;

struct node* third = NULL;

head = malloc(sizeof(struct node)); // allocate 3 nodes in the heap

second = malloc(sizeof(struct node));

third = malloc(sizeof(struct node));

head->data = 1; // setup first node

head->next = second; // note: pointer assignment rule

second->data = 2; // setup second node

second->next = third;

third->data = 3; // setup third link

third->next = NULL;

// At this point, the linked list referenced by "head"

// matches the list in the drawing.

return head;

}

Basic concepts and nomenclature

Each record of a linked list is often called an element or node.

The field of each node that contains address of the next node is usually called the next link or next pointer. The remaining fields may be called the data, information, value, or payload fields.

The head of a list is its first node, and the tail is the list minus that node (or a pointer thereto). In Lisp and some derived languages, the tail may be called the CDR (pronounced could-R) of the list, while the payload of the head node may be called the

Linear and circular lists

In the last node of a list, the link field often contains a null reference, a special value that is interpreted by programs as meaning "there is no such node". A less common convention is to make it point to the first node of the list; in that case the list is said to be circular or circularly linked; otherwise it is said to be open or linear.

Simply-, doubly-, and multiply-linked lists

In a doubly-linked list, each node contains, besides the next-node link, a second link field pointing to the previous node in the sequence. The two links may be called forward(s) and backwards. Linked lists that lack such pointers are said to be simply linked, or simple linked lists.

A doubly-linked list whose nodes contain three fields: an integer value, the link forward to the next node, and the link backward to the previous node

The technique known as XOR-linking allows a doubly-linked list to be implemented using a single link field in each node. However, this technique requires the ability to do bit operations on addresses, and therefore may not be available in some high-level languages.

In a multiply-linked list, each node contains two or more link fields, each field being used to connect the same set of data records in a different order (e.g., by name, by department, by date of birth, etc.). (While doubly-linked lists can be seen as special cases of multiply-linked list, the fact that the two orders are opposite to each other leads to simpler and more efficient algorithms, so they are usually treated as a separate case.)

Linked lists vs. arrays

	Array	Linked list
Indexing	Θ(1)	Θ(n)
Inserting / Deleting at end	Θ(1)	Θ(1) or Θ(n)
Inserting / Deleting in middle (with iterator)	Θ(n)	Θ(1)
Persistent	No	Singly yes
Locality	Great	Poor

Linked lists have several advantages over arrays. Insertion of an element at a specific point of a list is a constant-time operation, whereas insertion in an array may require moving half of the elements, or more. While one can "delete" an element from an array in constant time by somehow marking its slot as "vacant", an algorithm that iterates over the elements may have to skip a large number of vacant slots.

Simply-linked linear lists vs. other lists

While doubly-linked and/or circular lists have advantages over simply-linked linear lists, linear lists offer some advantages that make them preferable in some situations.

For one thing, a simply-linked linear list is a recursive data structure, because it contains a pointer to a smaller object of the same type. For that reason, many operations on simply-linked linear lists (such as merging two lists, or enumerating the elements in reverse order) often have very simple recursive algorithms, much simpler than any solution using iterative commands. While one can adapt those recursive solutions for doubly-linked and circularly-linked lists, the procedures generally need extra arguments and more complicated base cases.

Doubly-linked vs. singly-linked

Double-linked lists require more space per node (unless one uses xor-linking), and their elementary operations are more expensive; but they are often easier to manipulate because they allow sequential access to the list in both directions. In particular, one can insert or delete a node in a constant number of operations given only that node's address. To do the same in a singly-linked list, one must have the previous node's address. Some algorithms require access in both directions. On the other hand, they do not allow tail-sharing, and cannot be used as persistent data structures.

Circularly-linked vs. linearly-linked

A circularly linked list may be a natural option to represent arrays that are naturally circular, e.g. for the corners of a polygon, for a pool of buffers that are used and released in FIFO order, or for a set of processes that should be time-shared in round-robin order. In these applications, a pointer to any node serves as a handle to the whole list.

With a circular list, a pointer to the last node gives easy access also to the first node, by following one link. Thus, in applications that require access to both ends of the list (e.g., in the implementation of a queue), a circular structure allows one to handle the sructure by a single pointer, instead of two.

Using sentinel nodes

Sentinel node may simplify certain list operations, by ensuring that the next and/or previous nodes exist for every element, and that even empty lists have at least one node. One may also use a sentinel node at the end of the list, with an appropriate data field, to eliminate some end-of-list tests. For example, when scanning the list looking for a node with a given value x, setting the sentinel's data field to x makes it unnecessary to test for end-of-list inside the loop. Another example is the merging two sorted lists: if their sentinels have data fields set to +∞, the choice of the next output node does not need special handling for empty lists.

However, sentinel nodes use up extra space (especially in applications that use many short lists), and they may complicate other operations (such as the creation of a new empty list).

However, if the circular list is used merely to simulate a linear list, one may avoid some of this complexity by adding a single sentinel node to every list, between the last and the first data nodes. With this convention, an empty list consists of the sentinel node alone, pointing to itself via the next-node link. The list handle should then be a pointer to the last data node, before the sentinel, if the list is not empty; or to the sentinel itself, if the list is empty.

The same trick can be used to simplify the handling of a doubly-linked linear list, by turning it into a circular doubly-linked list with a single sentinel node. However, in this case, the handle should be a single pointer to the dummy node itself.

Linked list operations

When manipulating linked lists in-place, care must be taken to not use values that you have invalidated in previous assignments. This makes algorithms for inserting or deleting linked list nodes somewhat subtle. This section gives pseudocode for adding or removing nodes from singly, doubly, and circularly linked lists in-place. Throughout we will use null to refer to an end-of-list marker or sentinel, which may be implemented in a number of ways.

Linearly-linked lists

Singly-linked lists

Our node data structure will have two fields. We also keep a variable firstNode which always points to the first node in the list, or is null for an empty list.

 record Node {

    data // The data being stored in the node

    next // A reference to the next node, null for last node

 record List {

     Node firstNode   // points to first node of list; null for empty list

Traversal of a singly-linked list is simple, beginning at the first node and following each next link until we come to the end:

 node := list.firstNode

 while node not null {

     (do something with node.data)

     node := node.next

The following code inserts a node after an existing node in a singly linked list. The diagram shows how it works. Inserting a node before an existing one cannot be done; instead, you have to locate it while keeping track of the previous node.

 function insertAfter(Node node, Node newNode) { // insert newNode after node

     newNode.next := node.next

     node.next    := newNode

Inserting at the beginning of the list requires a separate function. This requires updating firstNode.

 function insertBeginning(List list, Node newNode) { // insert node before current first node

     newNode.next   := list.firstNode

     list.firstNode := newNode

Similarly, we have functions for removing the node after a given node, and for removing a node from the beginning of the list. The diagram demonstrates the former. To find and remove a particular node, one must again keep track of the previous element.

 function removeAfter(node node) { // remove node past this one

     obsoleteNode := node.next

     node.next := node.next.next

     destroy obsoleteNode

 function removeBeginning(List list) { // remove first node

     obsoleteNode := list.firstNode

     list.firstNode := list.firstNode.next          // point past deleted node

     destroy obsoleteNode

Notice that removeBeginning() sets list.firstNode to null when removing the last node in the list.

Since we can't iterate backwards, efficient "insertBefore" or "removeBefore" operations are not possible.

Appending one linked list to another can be inefficient unless a reference to the tail is kept as part of the List structure, because we must traverse the entire first list in order to find the tail, and then append the second list to this. Thus, if two linearly-linked lists are each of length n, list appending has asymptotic time complexity of O(n). In the Lisp family of languages, list appending is provided by the append procedure.

Many of the special cases of linked list operations can be eliminated by including a dummy element at the front of the list. This ensures that there are no special cases for the beginning of the list and renders both insertBeginning() and removeBeginning() unnecessary. In this case, the first useful data in the list will be found at list.firstNode.next.

Circularly-linked list

In a circularly linked list, all nodes are linked in a continuous circle, without using null. For lists with a front and a back (such as a queue), one stores a reference to the last node in the list. The next node after the last node is the first node. Elements can be added to the back of the list and removed from the front in constant time.

Circularly-linked lists can be either singly or doubly linked.

Both types of circularly-linked lists benefit from the ability to traverse the full list beginning at any given node. This often allows us to avoid storing firstNode and lastNode, although if the list may be empty we need a special representation for the empty list, such as a lastNode variable which points to some node in the list or is null if it's empty; we use such a lastNode here. This representation significantly simplifies adding and removing nodes with a non-empty list, but empty lists are then a special case.

Assuming that someNode is some node in a non-empty circular simply-linked list, this code iterates through that list starting with someNode:

 function iterate(someNode)

   if someNode  ≠ null

     node := someNode

do

       do something with node.value

       node := node.next

     while node ≠ someNode

Notice that the test "while node ≠ someNode" must be at the end of the loop. If it were replaced by the test "" at the beginning of the loop, the procedure would fail whenever the list had only one node.

This function inserts a node "newNode" into a circular linked list after a given node "node". If "node" is null, it assumes that the list is empty.

 function insertAfter(Node node, Node newNode)

     if node = null

       newNode.next := newNode

     else

       newNode.next := node.next

       node.next := newNode

Suppose that "L" is a variable pointing to the last node of a circular linked list (or null if the list is empty). To append "newNode" to the end of the list, one may do

 insertAfter(L, newNode)

 L = newNode

To insert "newNode" at the beginning of the list, one may do

 insertAfter(L, newNode)

 if L = null

   L = newNode

Linked lists using arrays of nodes

Languages that do not support any type of reference can still create links by replacing pointers with array indices. The approach is to keep an array of records, where each record has integer fields indicating the index of the next (and possibly previous) node in the array. Not all nodes in the array need be used. If records are not supported as well, parallel arrays can often be used instead.

As an example, consider the following linked list record that uses arrays instead of pointers:

 record Entry {

    integer next; // index of next entry in array

    integer prev; // previous entry (if double-linked)

    string name;

    real balance;

By creating an array of these structures, and an integer variable to store the index of the first element, a linked list can be built:

integer listHead;

Entry Records[1000];

Links between elements are formed by placing the array index of the next (or previous) cell into the Next or Prev field within a given element. For example:

Index	Next	Prev	Name	Balance
0	1	4	Jones, John	123.45
1	-1	0	Smith, Joseph	234.56
2 (listHead)	4	-1	Adams, Adam	0.00
3			Ignore, Ignatius	999.99
4	0	2	Another, Anita	876.54
5
6
7

In the above example, ListHead would be set to 2, the location of the first entry in the list. Notice that entry 3 and 5 through 7 are not part of the list. These cells are available for any additions to the list. By creating a ListFree integer variable, a free list could be created to keep track of what cells are available. If all entries are in use, the size of the array would have to be increased or some elements would have to be deleted before new entries could be stored in the list.

The following code would traverse the list and display names and account balance:

i := listHead;

while i >= 0 { '// loop through the list

     print i, Records[i].name, Records[i].balance // print entry

     i = Records[i].next;

When faced with a choice, the advantages of this approach include:

The linked list is relocatable, meaning it can be moved about in memory at will, and it can also be quickly and directly serialized for storage on disk or transfer over a network.
Especially for a small list, array indexes can occupy significantly less space than a full pointer on many architectures.
Locality of reference can be improved by keeping the nodes together in memory and by periodically rearranging them, although this can also be done in a general store.
Naïve dynamic memory allocators can produce an excessive amount of overhead storage for each node allocated; almost no allocation overhead is incurred per node in this approach.
Seizing an entry from a pre-allocated array is faster than using dynamic memory allocation for each node, since dynamic memory allocation typically requires a search for a free memory block of the desired size.

This approach has one main disadvantage, however: it creates and manages a private memory space for its nodes. This leads to the following issues:

It increase complexity of the implementation.
Growing a large array when it is full may be difficult or impossible, whereas finding space for a new linked list node in a large, general memory pool may be easier.
Adding elements to a dynamic array will occasionally (when it is full) unexpectedly take linear (O (n)) instead of constant time (although it's still an amortized constant).
Using a general memory pool leaves more memory for other data if the list is smaller than expected or if many nodes are freed.

For these reasons, this approach is mainly used for languages that do not support dynamic memory allocation. These disadvantages are also mitigated if the maximum size of the list is known at the time the array is created.

Language support

Many programming languages such as Lisp and Scheme have singly linked lists built in. In many functional languages, these lists are constructed from nodes, each called a cons or cons cell. The cons has two fields: the car, a reference to the data for that node, and the cdr, a reference to the next node. Although cons cells can be used to build other data structures, this is their primary purpose.

In languages that support Abstract data types or templates, linked list ADTs or templates are available for building linked lists. In other languages, linked lists are typically built using references together with records.

Here is a complete example in C:

#include <stdio.h>    /* for printf */

#include <stdlib.h>   /* for malloc */

struct node

      int data;

      struct node *next; /* pointer to next element in list */

};

struct node *list_add(struct node **p, int i)

      struct node *n = malloc(sizeof(struct node));

      if (n == NULL)

            return NULL;

      n->next = *p; /* the previous element (*p) now becomes the "next" element */

      *p = n;       /* add new empty element to the front (head) of the list */

      n->data = i;

      return *p;

void list_remove(struct node **p) /* remove head */

      if (*p != NULL)

            struct node *n = *p;

            *p = (*p)->next;

            free(n);

struct node **list_search(struct node **n, int i)

      while (*n != NULL)

            if ((*n)->data == i)

                  return n;

            n = &(*n)->next;

      return NULL;

void list_print(struct node *n)

      if (n == NULL)

            printf("list is empty\n");

      while (n != NULL)

            printf("print %p %p %d\n", n, n->next, n->data);

            n = n->next;

int main(void)

      struct node *n = NULL;

      list_add(&n, 0); /* list: 0 */

      list_add(&n, 1); /* list: 1 0 */

      list_add(&n, 2); /* list: 2 1 0 */

      list_add(&n, 3); /* list: 3 2 1 0 */

      list_add(&n, 4); /* list: 4 3 2 1 0 */

      list_print(n);

      list_remove(&n);                 /* remove first (4) */

      list_remove(&n->next);           /* remove new second (2) */

      list_remove(list_search(&n, 1)); /* remove cell containing 1 (first) */

      list_remove(&n->next);           /* remove second to last node (0) */

      list_remove(&n);                 /* remove last (3) */

      list_print(n);

      return 0;

--------------------------------------------------------------------------------------------------
--------------------------------------------------------------------------------------------------

Pages

Saturday, 27 July 2013

LINKED LIST THROUGH C LANGUAGE

Basic concepts and nomenclature

Linear and circular lists

Linked list operations

No comments:

Post a Comment