IBM Z and LinuxONE - Languages - Group home

C++ Templates: What's all this specialization about anyway?

  

I've worked out a plan so I can hopefully put a regular post on this blog. My plan is to talk about interesting things the compiler does with templates including lots of examples.


As a compiler developer, I tend to work with small examples that show or test how the compiler is behaving rather than examples that show how you might use a compiler feature in an application. Perhaps some will find this take on things interesting. I have many ideas for topics, but I'm also open to suggestions.


One of the things I find interesting is how the compiler handles overload resolution, and more specifically how partially specialized templates allow you to basically put classes through overload resolution. Let me explain with a series of examples.

struct t1{}; struct t2{}; struct t3{};void func(t1 arg){ printf("called t1\n"); }void func(t2 arg){ printf("called t2\n"); }void func(t3 arg){ printf("called t3\n"); }int main(void){t1 x1; t2 x2; t3 x3;func(x1);func(x2);func(x3);return 0;}

At runtime

called t1

called t2

called t3


This is pretty simple, the compiler can determine which function to call based on the arguments passed to the function. This is overload resolution. The compiler starts with a candidate list:

  1. void func(t1);
  2. void func(t2);
  3. void func(t3);

The compiler uses the arguments in each function call to select which candidate is the best match.


Now consider a template function:

#include <iostream>#include <typeinfo>struct t1{}; struct t2{}; struct t3{};using namespace std;template <class A, class B, class C> void func(A a1, B a2, C a3){   cout << "A: " << typeid(a1).name() << endl;   cout << "B: " << typeid(a2).name() << endl;   cout << "C: " << typeid(a3).name() << endl;}int main(void){  t1 x1; t2 x2; t3 x3;  func(x1,x2,x3);  return 0;}

At runtime

A: t1

B: t2

C: t3


In this example using a function template, the compiler has 1 candidate with 3 unknown types <A,B,C>. It takes the arguments to the call (x1,x2,x3) and the parameters to the function (A,B,C) and its pretty easy to see how it deduces the template arguments

A t1

B t2

C t3


The compiler instantiates the template function, filling in the arguments for the template parameters to create a real function:

void func(t1 a1, t2 a2, t3 a3)


Had there been additional overload candidates they would all be combined into overload resolution which would then pick the right one just like the non-template example.


Overload resolution allows you to create different versions of the same function that likely do something different based on the type passed in. The compiler uses the type information to select the right one. With the addition of template functions, you can reduce the number of overload candidates you need by defining functions in terms of parameterized types. The compiler can then pick the right template and create the overload candidate for you.


Now consider a template class:

#include <iostream>#include <typeinfo>using namespace std;struct t1{}; struct t2{}; struct t3{};template <class A, int I> struct container{   void callMe(){      cout << "primary A: " << typeid(A).name() << " I: " << I << endl;   }};int main(void){   container<t1,10> test;   test.callMe();   return 0;}

At runtime

primary A: t1 I: 10


The compiler doesn't have to do many tricks for such an example, there is only one class container. It takes the arguments <t1,10> and the parameters <A,I> and deduces that A is t1 and I is 10.


How about:

#include <iostream>#include <typeinfo>using namespace std;struct t1{}; struct t2{}; struct t3{};template <class A, int I> struct container{   void callMe(){      cout << "primary A: " << typeid(A).name() << " I: " << I << endl;   }};template <> struct container<t3,99>{   void callMe(){      cout << "complete specialization t3, 99" << endl;   }};int main(void){   container<t1,10> test1;   test1.callMe();   container<t3,99> test2;   test2.callMe();   return 0;}

At runtime

primary A: t1 I: 10

complete specialization t3, 99


In this case I have 2 versions of the template, one of which is completely specialized, meaning that all the parameters are specialized in this definition of the class. Specialized is just a fancy term meaning the parameters are no longer parameters, they have real values. I like to use the term complete specialization for these things as it makes more sense to me, but most C++ circles, including the standard, refer to this as an explicit specialization.


The compiler now has 2 versions of class container, the class is overloaded.

  1. template <class A, int I> struct container;
  2. template <> struct container<t3,99>;


When the compiler reaches container<t1,10> test1; with arguments <t1,10>:

For candidate 1 it can deduce <A=t1, I=10> so this is a viable candidate.

For candidate 2 there is nothing to deduce and no way that <t3,99> can match <t1,10> so this candidate is eliminated.

The compiler has only one candidate, which is chosen.


When the compiler reaches container<t3,99> test2; with arguments <t3,99>

For candidate 1 it can deduce <A=t3, I=99> so this is a viable candidate.

For candidate 2 it can determine that <t3,99> matches <t3,99> so this is a viable candidate.

In the case where it finds more than one viable candidate, the compiler always chooses the most specialized specialization, or candidate 2.


Here's the next example:

#include <iostream>#include <typeinfo>using namespace std;struct t1{}; struct t2{}; struct t3{};template <class A, int I> struct container{   void callMe(){      cout << "primary A: " << typeid(A).name() << " I: " << I << endl;   }};template <class A1>  struct container<A1,25>{   void callMe(){      cout << "partial specialization" << typeid(A1).name() << " and 25 " << endl;   }};template <> struct container<t3,99>{   void callMe(){      cout << "complete specialization t3, 99" << endl;   }};int main(void){   container<t1,10> test1;   test1.callMe();   container<t3,99> test2;   test2.callMe();   container<t2,25> test3;   test3.callMe();   container<t3,25> test4;   test4.callMe();   return 0;}

At runtime

primary A: t1 I: 10

complete specialization t3, 99

partial specializationt2 and 25

partial specializationt3 and 25


In this case there are 3 candidates

  1. template <class A, int I> struct container;
  2. template <class A1> struct container<A1,25>;
  3. template <> struct container<t3,99>;

Candidate 1 is our primary template which has 2 template parameters. Candidate 2 is a partial specialization with 1 template parameter. Candidate 3 is a complete specialization with no template parameters.


The term partial specialization is just a fancy way to say that some, but not all of the parameters have real values.


When the compiler encounters container<t3,25> test4; The arguments are <t3,25>

For candidate 1 the compiler can deduce <A=t3, I=25>, so this is a viable candidate.

For candidate 2 the compiler can deduce <A1=t3, 25> for the partial specialization, so this is a viable candidate.

For candidate 3 the compiler cannot possibly get <t3,99> from <t3,25> so this candidate is eliminated.


Candidate 2 is the most specialized specialization.


Here's a slight variation of this example:

#include <iostream>#include <typeinfo>using namespace std;struct t1{}; struct t2{}; struct t3{};template <class A, int I> struct container{   void callMe(){      cout << "primary A: " << typeid(A).name() << " I: " << I << endl;   }};template <int I1>  struct container<t3,I1>{   void callMe(){      cout << "partial specialization t3 and " << I1  << endl;   }};template <> struct container<t3,99>{   void callMe(){      cout << "complete specialization t3, 99 " << endl;   }};int main(void){   container<t1,10> test1;   test1.callMe();   container<t3,99> test2;   test2.callMe();   container<t3,75> test3;   test3.callMe();   container<t3,25> test4;   test4.callMe();   return 0;}

At runtime

primary A: t1 I: 10

complete specialization t3, 99

partial specialization t3 and 75

partial specialization t3 and 25


Internally, the process of matching or selection class template partial specializations is very much like overload resolution. In fact, in the case of more complex partial specializations the compiler may actually turn the partial specialization into a function and directly call overload resolution.


Template argument deduction is used for overload resolution and partial specialization matching.