Since Christmas Period Tinkering, I've been attending C++ classes and we've been getting various tasks to do. Last week we were told to take a normal string and print some stats about it. For example, we needed to count the number of words in the string, the occurrences of characters in the string and count how many upper and lower case characters there were. We could only use basic C arrays, and you cannot modify the string. No STL.

I thought this was a trivial problem, having previously achieved this with C# with ease through using library functions like String.Split(), String.Trim() etc. so I thought this would be relatively trivial. 

I was surprised at how long it took me to count the number of words in a string! There are string variances that need to be accommodated, i.e exceptions. For example, the string "The quick brown fox jumps over the lazy dog" has 9 words, and if you count the number of spaces, which is arguably the first thing most people would do, you soon encounter problems with variations of this string such as when there are leading or trailing spaces. Once this is solved, extra spaces in between words cause problems. Soon it becomes clear that counting spaces are not worth it. 

One way around this problem is to iterate through each character in the string and determine if the character is part of a word or not. It pretty much solves the leading and trailing issue and the original string is not modified:


#include <cmath>
#include <iostream>

using std :: cout;
using std :: cin;
using std :: endl;

int main(int argc, char* argv[])
{
	char myText[] = "The quick brown fox jumps over the lazy dog";
	cout << myText << endl;
	
	// 1. count characters not incl null
	int count_items = sizeof(myText) / sizeof(myText[0]);	
	int count_chars = count_items - 1;

	int count_upper = 0;
	int count_lower = 0;		
	int word_count = 0;
	bool in_word = false;
	
	const int MAX_CHARACTERS = 128; 
	int characters_counts[MAX_CHARACTERS] = {0};

	for(int c = 0; c <= count_chars-1; c++)
	{
		// Not in a word
		if(isspace(myText[c])) 
		{			
			in_word = false;
		}
		else
		{				
			 // if was not previously in a word, then we are in a new word, increase word count
			if(!in_word)
			{				
				word_count++;
			}			
			in_word = true; // In a word
		}

		// 2. identify if character is upper csase or lowercase (functions)
		
		if(isupper(myText[c]))
		{			
			count_upper++;		
		}		

		if(islower(myText[c]))
		{		
			count_lower++;
		}	

		// 3. count number given characters eg space in the string []
		characters_counts[myText[c]]++;
	}

	cout << endl;
	cout << "Total character count: " << count_chars << endl;
	cout << "Uppercase count: " << count_upper << endl;
	cout << "Lowercase count: " << count_lower << endl;

	// 4. count the number of words in the string
	cout << "Word count: " << word_count << endl;

	for(int i = 0 ; i <= MAX_CHARACTERS-1; i++)
	{
		int count = characters_counts[i];
		if(count > 0)
		{
			cout << "'" << (char) i << "' --> ";
			cout << std::dec << count;
			cout << " " << endl;
		}
	}

}

Of course, there are limitations. For example, one is that this approach still bases a large portion of its logic on using spaces as word separators, so it doesn't consider punctuations as word separators. It only prints the ASCII characters that you use and probably countless other problems which isn't the point. 

On reflection, this is what practise is and why it's important. It's not about the triviality of the problem, it's the dynamic affordances that are provided by encountering problems.

Besides this, I set myself an arguably more interesting task recently, which was to figure out how to generate all the permutations of a series of digits. For example, if you had a combination lock with 6 digits, generate all the possible permutations.

To generate the Tth last digit for the nth permutation, this general formula can be used:

\[\lceil{\frac{n}{p^{t-1}}}\rceil - P\lceil{\frac{n}{p^t}}\rceil -P\]

where t is relative to the last digit to generate, i.e t=1 is the last digit, t=2 is the penultimate etc, and P is the number of possible values that this digit can have.

For example a 3-digit code where each digit can be A-D, ie 4 possibly sumbols per digit (p=4), would allow you to generate the 12th (P=12) permutation like this:

\[ \lceil{\frac{12}{4^{2}}}\rceil - 4\lceil{\frac{12}{4^3}}\rceil -4, \lceil{\frac{12}{4^{1}}}\rceil - 4\lceil{\frac{12}{4^2}}\rceil -4, \lceil{\frac{12}{4^{0}}}\rceil - P\lceil{\frac{12}{4^1}}\rceil -4 \]

This would result in the T=3, 2, 1 mapping to digit values: 1, 3, 4 which map to 1= A, 2=B, 3=C and 4=D. So the 12th permutation is ACD.

I wrote a program the verify this:

#include <cmath>
#include <iostream>
#include "permeate.h"

typedef unsigned long long int SIZE;

using std :: cout;
using std :: cin;
using std :: endl;

void print_n_permeation(SIZE n, int p, int num_terms, char* possibilities)
{
	cout << "n=" << n << ") " ;
		for(int t = num_terms; t >= 1; t--)
		{
			int a = (ceil(n/pow(p,t-1)));
			int b = p * ceil(n/pow(p,t));
			int c = p;
			int result = a - (b -c);
			cout << " " << possibilities[result-1] << " ";
		}
		cout << endl;
}

void print_all_permeations(SIZE total_permeations, int p, int num_terms, char* possibilities)
{
	for (int n = 1; n <= total_permeations; n++)
	{
		print_n_permeation(n, p, num_terms, possibilities);
	}
}

int main(int argc, char* argv[])
{
	int num_terms;
	int p;
	cout << "Enter the number of digits:";
	cin >> num_terms;
	cout << "Enter the number of possibilities per digit:";
	cin >> p;

	char* possibilities = new char[p](); // dynamic array in c++s

	for(int i = 0;i < p;i++)
	{
		cout << "Enter possibility #" << i << ": ";
		cin >> possibilities[i];
	}

	SIZE total_permeations = pow(p, num_terms); // num_terms aka num digits
	cout << "Total permeations:" << total_permeations << endl;

	
	SIZE n; // specific permeation
	cout << "Enter permeation to find (0=ALL, -1=None): ";

	while(cin >> n)
	{
		if(n == 0)
			print_all_permeations(total_permeations, p, num_terms, possibilities);	
		else if (n == -1)
			break;
		else
		{
			if( n <= total_permeations)
				print_n_permeation(n, p, num_terms, possibilities);
			else
				cout << "Invalid, must be between 1 and " << total_permeations << endl;
		}
	} 
}


Pretty neat I thought, but there are other ways to do this without calculation, such as Heap's algorithm that swaps characters iteratively. Interesting non the less!