Curiously recurring template pattern and visitor implementation in C++/Java

Curiously recurring template pattern and visitor implementation in C++/Java

The curiously recurring template pattern (CRTP) is a C++ programming idiom that powers the static polymorphism, the policy-based design and many other metaprogramming techniques. The pattern is basically a class inheritance hierarchy in which the super-class has a type parameter that happens to be the sub-class extending it, so that the super-class can refer to the sub-class wherever needed, e.g., in implementing the method chaining/fluent interface. Furthermore, if the super-class is not instantiatable, the this pointer in the super-class can be statically down-casted to a sub-class pointer, because the current instance this points to can only be a sub-class instance.

The pattern is demonstrated using a visitor implementation as follows:

  1. #include<iostream>
  2. using namespace std;
  3.  
  4. struct Visitor;
  5. struct Visitable { virtual void accept(const Visitor& visitor) const = 0; };
  6.  
  7. template<typename T> struct AbstractNode : public Visitable {
  8.   virtual void accept(const Visitor& visitor) const; };
  9.  
  10. struct TextNode : public AbstractNode<TextNode> {}; // CRTP
  11. struct ImageNode : public AbstractNode<ImageNode> {}; // CRTP
  12.  
  13. struct Visitor {
  14.   void visit(const TextNode& textNode) const { cout<<"TextNode."<<endl; }
  15.   void visit(const ImageNode& imageNode) const { cout<<"ImageNode."<<endl; }
  16. };
  17.  
  18. template<typename T>
  19. void AbstractNode<T>::accept(const Visitor& visitor) const {
  20.   visitor.visit(*(static_cast<const T*>(this))); }
  21.  
  22. int main() {
  23.   TextNode textNode;
  24.   ImageNode imageNode;
  25.   Visitor visitor;
  26.   Visitable *visitable = &textNode;
  27.   visitable->accept(visitor); // TextNode.
  28.   visitable = &imageNode;
  29.   visitable->accept(visitor); // ImageNode.
  30.   return EXIT_SUCCESS;
  31. }

where I omitted the virtual destructors, copy constructors, assignment operator overloadings, etc, just for the conciseness of the demonstration.

AbstractNode is the super-class which holds the common accept() method for all its immediate sub-classes. In the accept() method, this is statically casted from AbstractNode<T>* to T*, as T extends AbstractNode<T> (We could have qualified the destructor of AbstractNode as protected, so that this can only point to an instance of the type T, not the type AbstractNode<T>.). Then we pass the dereferenced instance to the visitor to accomplish the double-dispatch. Note that because of the reification implementation of the C++ templates, for each T extending AbstractNode<T>, the compiler will generate an AbstractNode<T>, in which the specific overloading Visitor::visit(const T&) among the visitor methods is statically bound in the accept() method (Method overloading is compile-time, that's why the double dispatch here is perceived to be simulated.). Thanks to the reification, we only have to write the boilerplate accept() method once.

A similar visitor implementation in Java is also shown:

  1. package me.shanhe.pattern;
  2.  
  3. public class CuriouslyRecurringTemplatePattern {
  4.   public static abstract class AbstractNode<T extends AbstractNode<T>> {
  5.     protected String name;
  6.     public abstract void build();
  7.     @SuppressWarnings("unchecked")
  8.     public T setName(final String name) {
  9.       this.name = name;
  10.       return (T) this; // valid as long as T is a sub-class of AbstractNode<T>.
  11.     }
  12.     @SuppressWarnings("unchecked")
  13.     public void accept(final Visitor visitor) {
  14.       visitor.visit((T) this); // won't work, T is bound to AbstractNode<T>.
  15.     }
  16.   }
  17.  
  18.   public static class TextNode extends AbstractNode<TextNode> {
  19.     private String text;
  20.     public TextNode setText(final String text) {
  21.       this.text = text; return this; }
  22.     @Override
  23.     public void build() { System.out.println(name + ": " + text + "."); }
  24.   }
  25.  
  26.   public static class ImageNode extends AbstractNode<ImageNode> {
  27.     private String source;
  28.     public ImageNode setSource(final String source) {
  29.       this.source = source; return this; }
  30.     @Override
  31.     public void build() { System.out.println(name + ": " + source + "."); }
  32.   }
  33.  
  34.   public static class Visitor {
  35.     public void visit(final AbstractNode<?> node) {
  36.       System.out.println("AbstractNode."); }
  37.     public void visit(final TextNode node) { System.out.println("TextNode."); }
  38.     public void visit(final ImageNode node) {System.out.println("ImageNode.");}
  39.   }
  40.  
  41.   public static void main(final String[] args) {
  42.     new TextNode().setName("name").setText("text").build(); // name: text.
  43.     new ImageNode().setName("name").setSource("src").build(); // name: src.
  44.     new TextNode().accept(new Visitor()); // AbstractNode.
  45.     new ImageNode().accept(new Visitor()); // AbstractNode.
  46.   }
  47. }

AbstractNode<T> holds the common name field for the sub-classes, and provides a setter for that field. The setter down-casts this and returns, so that the setter can be chained with other methods in a sub-class. AbstractNode<T> also holds the common accept() method for the sub-classes, attempting to do the same thing as we did in the implementation in C++. However, what the accept() method actually does is different from the intention.

The difference comes from the fact that Java implements generics using the erasure, not the reification in C++. Generic classes in Java are not templates for the compiler to generate code, but classes requesting compile-time type checking. At compile-time, the compiler does the type checking, and erases type parameters in the classes by replacing the parameters with the topmost Object type. There should also be implicit type-castings happening at runtime from the Object type to the actual types of formal type parameters. Though these castings are down-castings, they have been checked by the compiler and should never throw exceptions. The erasure is a design decision made for the bytecode-level backward compatibility. It does not generate a copy of a generic class per actual type of a formal type parameter in the class. There is always only the class itself. Practically, if we recognize the generics in Java as mere compiler checked type-castings, it would be straightforward to understand a lot of its behavior and limitations. This recognition also suggests that whenever we are doing explicit type-castings, e.g., using raw types, we should consider using generics before actually doing those.

Now let us go back to the accept() method in the above example. The problem is that the accept() method uses method overloading to simulate double dispatch, which is static. But according to the erasure, there is only one accept() method; There is no way for the only accept() method to statically bind different visitor methods. What the compiler indeed does is that it sees T being bounded by AbstractNode<T> in the type parameter declaration, so it chooses the overloading visit(AbstractNode<?>) to bind. Were there no bound for T, compile should fail. (By the way, note that the type-casting does not really happen at runtime where there is (T). Because the L-value of the casting should be of type T, which is the topmost Object at runtime. The casting does not need to happen, as everything is an Object. However, the compiler generated implicit type-casting does happen and possibly throws exceptions.)

To fix the implementation in Java, we have to create an accept() method for each of the sub-classes. Although the method body is the same for all the sub-classes, each of the method statically binds a different visitor overload. Which specific visitor overload to call is determined dynamically at runtime by the late-binding through the accept() methods. This is literally the standard visitor pattern. An alternative fix is to use reflection, but it comes with performance tradeoffs. I personally favor the former fix.

Comments

you are brilliant. by Anonymous (not verified)

Post new comment

The content of this field is kept private and will not be shown publicly.
  • Web page addresses and e-mail addresses turn into links automatically.
  • Lines and paragraphs break automatically.

More information about formatting options

To prevent automated spam submissions leave this field empty.