C: While loops for input

c go java js lua py vbs rkt scm pro

Many problems fall in to the following general scheme.

Input: Read the input data.
Process: Process the data, computing intermediate results along the way.
Output: Write the output results in a meaningful form.

Many loops that process a set of items can be expressed by the following pseudo-code. A general form of pseudo-code for this problem is as follows.

FOR EACH

DO

END FOR

Note: The use of FOR EACH does not imply a for loop. In many cases, a while loop would be preferred.

A more specific form of the pseudo-code is as follows.

WHILE

DO

END WHILE

The pseudo-code for the FOR EACH on the can be expressed by the WHILE loop. Since the pseudo-code for the FOR EACH loop is more concise, it is the better pseudo-code form. Not all WHILE loops, however, can be easily expressed in a FOR EACH form.

Consider the problem of reading nonnegative numbers from an input, one number per line, and printing the sum total of all of the numbers. So an input file input-01.dat containing the numbers

23 57 75

would result in the output of the number 155. The data items are assumed to be integer numbers and achieving the goal requires that a running sum be computed and stored. Since the concept of a sum is required, let us use the name sum to represent the running sum of the numbers read so far (this is an English description of the loop invariant).

The following pseudo-code solution results.

sum

FOR

EACH

x

DO

x

sum

END

sum

The name x has been used to indicate a data item. This name will eventually be represented as a variable in the program.

The pseudo-code does not indicate the order in which the items are to be processed. The items could be processed in the order in which they appear in the data, in random order, or in reverse order of appearance, just as long as they are all processed and each is only processed once. In terms of machine efficiency, however, the order can be crucially important. A programmer must pick an order. When the data items are initially in an input data file, the logical order is to process them in the order in which they appear in the input data file. Since this data must be stored somewhere, variables must be declared to hold the data in computer memory. If all or part of the input data stream can be handled in a similar manner, it is often wise to reuse variables than to declare a variable for each item in the input data stream. This is especially true when it is unknown at compile-time how many data items will be in the input data stream. Usually the data items are numbered (internally within the program) starting at 1. In many cases, explicit numbering is not needed; only the total number of data items must actually be stored.

When reading a sequence of data items from the input data stream, the natural question to ask is, "How will the program determine when the end of the input data has been reached?". There are several methods that are often used to determine the end of the data.

In a counter loop, the number of data items is known at compile-time.
In a header loop, the first data item represents the number of data items.
In a trailer loop, a special sentinel value guards the end of the data items.
In an end of file loop, the end of file marker is explicitly detected.

Which input data form is the preferred one? Well, it depends on what you want to do and what restrictions are in force. If the data format has already been specified, you must write your program to fit the data. If the program already exists, you must create your data to fit the program. If neither has been specified, you may design accordingly, keeping in mind that these decisions may effect others who will later use your program, or data. Advantages and disadvantages of each form of loop will be discussed in turn. We will now look at various implementations of the above pseudo-code. In each of the implementations to be presented, the following variable definitions apply.

dataPos1 represents the number of the current data item being processed.
dataLen1 represents the total number of data items.
dataValue1 represents the value of the current data item being processed.
dataSum1 represents the sum of the data items processed so far.

A variable definition for each variable, or group of similar variables, should be included as comments in the program. These definitions form the basis of a variable dictionary. The variable definitions have been removed in the following programs to save space. Also, note that these programs do not all use all of the above variables.

When the number of data items is known at compile time, no loop is really necessary. Here is a program that just reads the three data values.

Here is the C code.

#include <stdio.h>

int main() {
	// declarations
	int dataValue1;
	int dataValue2;
	int dataValue3;
	int dataSum1;
	// input
	scanf("%d",&dataValue1);
	scanf("%d",&dataValue2);
	scanf("%d",&dataValue3);
	// process
	dataSum1 = dataValue1 + dataValue2 + dataValue3;
	// output
	printf("The sum of 3 values is %d.\n",dataSum1);
	return 0;
	}

Here are some examples of input and output for the above program code.

Here is an example input.

23
57
75

For the above example input, here is the expected output.

The sum of 3 values is 155.

This program, however, has a lot of unnecessary repetition. (See the next program to see how this can be addressed).

When the number of data items is known at compile-time, a counter loop can be used. For example, the three data items can be read with three input statements as follows. Here is the C code.

#include <stdio.h>

int main() {
	int dataPos1;
	int dataLen1;
	int dataValue1;
	int dataSum1;

	dataPos1 = 0;
	dataSum1 = 0;
	printf("Data values:\n");
	printf("\n");

	// set counter value
	dataLen1 = 3;
	while (dataPos1 != dataLen1) {
		scanf("%d",&dataValue1);
		dataPos1 += 1;
		printf("\t%d. %d\n",dataPos1,dataValue1);
		dataSum1 += dataValue1;
		}

	printf("\n");
	printf("The sum of %d values is %d.\n",dataLen1,dataSum1);
	return 0;
	}

Here are some examples of input and output for the above program code.

Here is an example input.

23
57
75

For the above example input, here is the expected output.

Data values:

	1. 23
	2. 57
	3. 75

The sum of 3 values is 155.

This solution, however, does not scale up. That is, it would be impractical to use this solution for a large number of data items. Using a loop for input, the program might appear as follows.

A header loop requires that the count of the number of items to be read precede the data, as in the following data set.

3 23 57 75

If the number of data items precedes the data, a header loop can be used. A refined pseudo-code might be as follows.

n

FOR EACH

n

DO

END FOR

A pseudo-code solution should be as general as possible, so for most purposes, the original pseudo-code would be preferable to the header pseudo-code. An implementation might be as follows.

Usually the number of data items must be known later in the program, so count up from zero and not down by destroying the variable containing the count of data values. Here is the C code.

#include <stdio.h>

int main() {
	int dataPos1;
	int dataLen1;
	int dataValue1;
	int dataSum1;

	dataPos1 = 0;
	dataSum1 = 0;
	printf("Data values:\n");
	printf("\n");

	// get header value
	scanf("%d",&dataLen1);
	while (dataPos1 != dataLen1) {
		scanf("%d",&dataValue1);
		dataPos1 += 1;
		printf("\t%d. %d\n",dataPos1,dataValue1);
		dataSum1 += dataValue1;
		}

	printf("\n");
	printf("The sum of %d values is %d.\n",dataLen1,dataSum1);
	return 0;
	}

Here are some examples of input and output for the above program code.

Here is an example input.

For the above example input, here is the expected output.

Data values:

	1. 23
	2. 57
	3. 75

The sum of 3 values is 155.

The header must be a number, although the data items can be of other forms. A header loop is simple to implement, but it is a tedious and error-prone task for humans to update the count. Header loops are often used when the computer is doing the counting.

If the data ends with a special value, then a trailer loop can be used. This special value is called a sentinel, since the value guards, or marks, the end of the data.

A trailer loop requires that a special value, not used in the actual data, mark the end of the data, as in the following data values.

23 57 75 -1

The pseudo-code for a trailer loop may be expanded as follows.

item

WHILE

item

DO

item

END

Notice that item must always be checked to insure that it is not the sentinel before processing the data item. A pseudo-code solution should be as general as possible, so for most purposes, the original pseudo-code would be preferable to the trailer pseudo-code. The pseudo-code might be implemented as follows. Here is the C code.

#include <stdio.h>

int main() {
	int dataPos1;
	int dataLen1;
	int dataValue1;
	int dataSum1;

	dataPos1 = 0;
	dataSum1 = 0;
	printf("Data values:\n");
	printf("\n");

	// get trailer or data value
	scanf("{dataValue1}");
	dataLen1 = 0;
	while (dataValue1 >= 0) {
		dataLen1 += 1;
		dataPos1 += 1;
		printf("\t%d. %d\n",dataPos1,dataValue1);
		dataSum1 += dataValue1;
		// get trailer or data value
		scanf("%d",&dataValue1);
		}

	printf("\n");
	printf("The sum of %d values is %d.\n",dataLen1,dataSum1);
	return 0;
	}

Here are some examples of input and output for the above program code.

Here is an example input.

For the above example input, here is the expected output.

Data values:


The sum of 0 values is 0.

Two restrictions apply to the sentinel value used to guard the end of the data.

The sentinel value cannot be a legal value for the data. Why?
The sentinel value must be of the same type as the data values. Why?

Note that the sentinel value need not be a single value, but can be a set of values.

The trailer data format is easier for humans to update than the header loop format. Why?

An endfile loop explicitly detects the end of the input file, as in the following data file, input-04.

23 57 75

The standard input file is stdin.

The function to detect the end of the input is feof.

When typing input into a program, the following keys simulate the end of file for stdin.

Windows: Type Ctrl-Z and press Enter.
Linux: Type Ctrl-D.

For more information, see Pipes and redirection . The pseudo-code for an endfile loop is as follows.

WHILE

DO

END

As such, there can only be one endfile loop per input file.

An endfile loop might be implemented as follows. Here is the C code.

#include <stdio.h>

int main() {
	int dataPos1;
	int dataLen1;
	int dataValue1;
	int dataSum1;

	dataPos1 = 0;
	dataSum1 = 0;
	printf("Data values:\n");
	printf("\n");

	dataLen1 = 0;
	while (! feof(stdin)) {
		scanf("%d",&dataValue1);
		dataLen1 += 1;
		dataPos1 += 1;
		printf("\t%d. %d\n",dataPos1,dataValue1);
		dataSum1 += dataValue1;
		}

	printf("\n");
	printf("The sum of %d values is %d.\n",dataLen1,dataSum1);
	return 0;
	}

Here are some examples of input and output for the above program code.

Here is an example input.

23
57
75

For the above example input, here is the expected output.

Data values:

	1. 23
	2. 57
	3. 75

The sum of 3 values is 155.

Compare the following input data formats.

Counter loop format:

23 57 75

Header loop format:

3 23 57 75

Trailer loop format in input-03.dat:

23 57 75 -1

Endfile loop format in input-04.dat:

23 57 75

Each of these methods has their own particular input data form, advantages, disadvantages, and implementation considerations. In practice, combinations of these methods are used in the same program and input data stream. What are the advantages and disadvantages of each?

A loop is a robust loop if it gracefully handles the degenerate case

of no input data. The following are the input data file format for each of the input loop forms.

Counter loop: (empty file)

Header loop:

0

Trailer loop:

-1

Endfile loop: (empty file)

4