ACCU Home page ACCU Conference Page
Search Contact us ACCU at Flickr ACCU at GitHib ACCU at Google+ ACCU at Facebook ACCU at Linked-in ACCU at Twitter Skip Navigation

pinPolynomial Classes

Overload Journal #69 - Oct 2005 + Programming Topics   Author: William Hastings

A multivariate polynomial in several variables, for example, 7 + 4x2y3 - 5yz6, may be thought of as a polynomial in z with coefficients that are themselves polynomials in x and y. These coefficients, in turn, are polynomials in y with coefficients that are polynomials in x. How can we implement this recursive definition in C++? In this installment and the next we will explore two distinct solutions. In the first solution the number of variables is built into the definition of the class. In the second, the number of variables may vary at runtime. Before proceeding, note that similar issues arise if we define a multi-dimensional array as an array of arrays (which is useful if the number of non-zero entries is relatively small).

The Straightforward Approach

If we know the number of variables at compile time, then the overall structure of a possible solution is straightforward. Suppose we define a template class

template <typename C> class polynomial {...};

where C is the type of the coefficients. This class is a container of pairs of the form [power, coefficient]. For the polynomial above, we have two pairs, [0, 7 + 4x2y3] and [6, -5y]. A polynomial in three variables with numerical coefficients of type double has type

typedef class polynomial<polynomial<polynomial
   <double> > > poly3_type;

A useful way to define poly3_type is:

typedef class polynomial<double> poly1_type;
typedef class polynomial<poly1_type> 
   poly2_type;
typedef class polynomial<poly2_type> 
   poly3_type;

We can write

variable_list<char> vars3("xyz");
poly3_type p(vars3), q(vars3);
p.read("5-3x^2y^4+x^3z^3");
std::cin >> q;
std::cout << "p has " << p.number_terms() 
   << "terms.\n";
std::cout << "p+q=" << p + q << '\n';

Note that the caret (^) means exponentiation, so y^3 means y*y*y. The role of class variable_list will be explained below.

Problems With This Approach

This approach works well, but there are a few problems to overcome. First, suppose we want our polynomial class to have a method that returns the number of terms. In our example polynomial there are three terms. This is easy to compute - just sum the number of terms in each coefficient:

template <typename C>
int polynomial<C>::number_terms() const
{
  int count = 0;
  for (const_iterator it(begin());
   it != end(); ++it)
    count += it->coefficient().number_terms();
  return count;
}

Here, the method coefficient() returns a const reference to the second element of a pair in our container. The problem is that this code will not compile. Note that the call to number_terms() inside the for loop is not a true recursive call. For example, if we call this method for a polynomial of type poly3_type, then it >coefficient() has type poly2_type and so we are calling poly2_type::number_terms, which in turn calls poly1_type::number_terms. It is this last call that causes the problem, since now it >coefficient() has type double, and type double has no method named number_terms.

There are at least three ways to deal with this problem. First, we could use partial specialization to handle the end case. Having defined

template <typename C> class polynomial;

we can now define, for example,

template <> class polynomial<double>;

But this requires specializing class polynomial for each possible numerical type, for example,

template <>
   class polynomial<std::complex<double> >;

A better way is to turn this approach on its head and define the partial specialization for the case where the coefficients are themselves polynomials:

template <typename C>
   class polynomial<polynomial<C> > 
{ . . . };

With this approach, we have

template <typename C> class polynomial {
//Coefficients are numeric
public:
  int number_terms() const {
    int count = 0;
    for (const_iterator it(begin());
       it != end(); ++it)
      if (it->coefficient() != 0)
        ++count;
    return count;
  }
  //other methods . . .
}

and the partial specialization

template <typename C> 
   class polynomial<polynomial<C> > {
//Coefficients are polynomials
public:
  int number_terms() const {int count = 0;
    for (const_iterator it(begin());
       it != end(); ++it)
      count +=
         it->coefficient().number_terms();
    return count;
  }
  //other methods . . .
}

Using a Traits Class

The objection to this approach is that we must repeat all of the code in the definition of the class and in its partial specialization, even if the code is identical. A variant on this approach is to use a traits class [Alexandrescu-]. We define (for numeric coefficients)

template <typename C> class coefficient_traits {
public:
  static int number_terms(C const &c)
  { return c != 0 ? 1 : 0; }
  //other methods . . .
};

As above, we need to specialize this for the case of a multivariate polynomial:

template <typename C>
class coefficient_traits<polynomial<C> >
{
public:
  typedef polynomial<C> coef_type;
  static int number_terms(coef_type const &c)
  {
    int count = 0;
    for (coef_type::const_iterator
       it(c.begin()); 
       it != c.end(); ++it) 
      count +=
         it->coefficient().number_terms(); 
    return count;
  }
  //other methods . . .
};

Then we can define number_terms() in class polynomial by

int number_terms() const {
  int count = 0;
  for (const_iterator it(begin());
     it != end(); ++it)
    count += coefficient_traits<C>::
       number_terms(it->coefficient());
  return count;
}

Of course, the traits class can be used to record other properties. Suppose we have a template class called numeric_traits that includes all the methods required for coefficient_traits with definitions that are appropriate for most numeric types. Then we define coefficient_traits by

template <typename T>
class coefficient_traits 
: public numeric_traits<T> { };//Default case

Of course, we must provide a partial specialization for coefficients that are polynomials:

template <typename C>
class coefficient_traits<polynomial<C> > {...};

We can also specialize as needed for numeric coefficients. For example, to output the polynomial x - 3y, we need to know that the coefficient of y is negative. (Otherwise, we would output x + -3y.) Now for a coefficient c of type double, we can ask if c<0, but not if c has type complex. Therefore, we include in coefficient_traits a method

static bool is_negative(C const &c);

For C of type std::complex<T>, we can override this method in an appropriate way.

One possibility is:

template <typename T>
class coefficient_traits< std::complex<T> >
: public numeric_traits<std::complex<T> >
{
public:
  static bool is_negative(coef_type const & c)
  { return c.imag() == 0 && c.real() < 0; }
  //other overrides
};

Numeric versus Polynomial Coefficients - a Third Way

Let's get back to our original problem, namely how to differentiate between coefficients that are polynomials and those that are numeric. Our third approach relies on the fact that in a template class, a method that is not invoked must not be instantiated. With this approach the method number_terms calls one of two methods depending on the coefficient type. For this to work, the choice must be made at compile time. To this end we use a technique described in Alexandrescu's book [Alexandrescu]. First an auxiliary class will be used to distinguish the two cases:

template < bool b > struct Bool2Type { };

Now, define in class polynomial<C>

int number_terms() const
{
  return number_terms(
    Bool2Type<(nbr_vars==1)>());
}

int number_terms(Bool2Type<false>) const
{
  int count = 0;
  for (const_iterator it(begin());
      it != end(); ++it)
    count += it->coefficient().number_terms();
  return count;
}

int number_terms(Bool2Type<true>) const
{
  int count = 0;
  for (const_iterator it(begin());
      it != end(); ++it)
    if (it->coefficient() != 0)
      ++count;
  return count;
}

Depending on the type C, exactly one of the latter two methods will be invoked; the other will not be instantiated, so there will be no compile-time error. The compile-time constant nbr_vars has other uses as we will see below. It is defined by

namespace detail
{
  //Count the number of variables in a poly
  template <typename C>//Base case
  struct var_count
  {
    static const int nbr_vars = 0;
  };
  template <typename C>
  struct var_count<polynomial<C> >
  {
    static const int nbr_vars =
       1 + var_count<C>::nbr_vars;
  };
}

and then in class polynomial<C>:

static const int nbr_vars =
  detail::var_count<polynomial<C> >::nbr_vars;

In class polynomial most of the methods do not depend on the coefficient type; for the few that do, I have used both the traits class and method pairs using Bool2Type.

Variables

Before proceeding with how class polynomial is constructed, we need to consider variable names and polynomials with different numbers of variables. The representation of polynomials we are examining here implies an ordering of the variables. For example, we may think of p as a polynomial in y with coefficients that are polynomials in x. Of course, we could just as well think of p as a polynomial in x with coefficients in y. What is essential, of course, is consistency. To compute p + q , for example, we want p and q to have variables in the same order. Furthermore, to simplify input it is useful to know in advance what the variable names are.

Let's put the responsibility for maintaining a collection of variable names with order in a separate class

template <typename NameType> 
   class variable_list;

This class is just an ordered container of names of type NameType (e.g. char) [var_list]. Its primary constructor is

variable_list(char const * vars);

where the character string vars might be "xyz". The constructor will transform vars into an ordered list. (By default, this is reverse alphabetical order.) So, for example,

variable_list<char> vars1("xyz");

and

variable_list<char> vars1("zyx");

are equivalent.[1]

Now, to construct polynomials we have

typedef variable_list<char> var_list_type;
var_list_type vars3("xyz"), vars2("xy");
poly2_type p(vars2);
poly3_type q(vars3);

Polynomial Arithmetic

Suppose we want to compute q -= p. (We use subtraction as an example since p - q <> q - p , and so it is slightly more complicated than addition.) Mathematically, q - p makes sense as a polynomial whatever the variables of p and q. It will have three variables if the variables of p are also variables of q. For example, if p = x + z and q = xyz, then q - p is x + z - xyz. On the other hand, if p = w + x, then we have a problem with q -= p, since q - p now has four variables while the number of variables of q is fixed at three. (We'll see how to solve this problem in part 2.)

In any event, I would like to be tolerant and allow p and q to have different variables, or at least a different number of variables. This implies that we need to declare the subtraction operator by

template <typename C1, typename C2>
????? operator -(polynomial<C1> const & p,
                 polynomial<C2> const & q);

The fun begins with the return type. If C1 and C2 are not the same, the return type should be the type of the argument that has more variables. For example, the return type of p - q (of types poly2_type and poly3_type, as above) should be poly3_type. Here are two ways to deal with this problem. Both approaches need to compare the number of variables of types polynomial<C1> and polynomial<C2>, so let's encapsulate this information in a class:

namespace detail
{
template
<typename C1, typename C2, bool b>
struct pick_poly //case b is false
  { typedef polynomial<C2> type; };

template <typename C1, typename C2>
struct pick_poly<C1, C2, true> 
  { typedef polynomial <C1> type; };

template <typename C1, typename C2>
struct compare_coefs {
  static const bool use_c1 =
    (polynomial<C1>::nbr_vars >=
     polynomial <C2>::nbr_vars);

  typedef typename
  pick_poly<C1,C2,use_c1>::type return_type;
};
}

Here the constant use_c1 is true if polynomial<C1> has at least as many variables as polynomial<C2> and compare_coefs<C1,C2>::return_type is either polynomial<C1> or polynomial<C2>, depending on which has more variables. Hence this type should be the return type of operator -. Our first approach to defining subtraction uses Bool2Type defined above:

namespace detail {
  template <typename C1, typename C2>
  polynomial<C1> do_subtract(
   polynomial<C1> p,
   polynomial<C2> const & q,
   Bool2Type<true>)
  {
    return p -= q;
  }
  template <typename C1, typename C2>
  polynomial<C2> do_subtract(
   polynomial<C1> const & p,
   polynomial<C2> const & q,
   Bool2Type<false>)
  {
//create a polynomial of type polynomial<C2>:
    return -q += p;
  }
}
template <typename C1, typename C2>
typename detail::compare_coefs<C1,C2>
               ::return_type
operator-(polynomial<C1> const &p,
          polynomial<C2> const &q)
{
  return detail::do_subtract(p, q,
    Bool2Type<detail::compare_coefs<C1,C2>
                    ::use_c1>());
}

The third argument to do_subtract ensures that we return a polynomial of the correct type. Note that if p and q always have the same number of variables, then the second version of do_subtract will not be instantiated. Also note that the types of the first argument in the two versions of do_subtract are slightly different: the first takes a polynomial by value, the second by const reference. To see why, examine the bodies of the two versions.

Using enable_if

The second approach uses boost::enable_if, which is based on the SFINAE [enable_if] principle. Two versions of operator - are defined, but only one version is valid:

template <typename C1, typename C2>
typename
boost::enable_if_c<detail::compare_coefs
   <C1,C2>::use_c1, polynomial<C1> >::type
operator -(polynomial<C1> p,
  polynomial<C2> const & q)
{
  return p -= q;
}
template <typename C1, typename C2>
typename
boost::disable_if_c<detail::compare_coefs
      <C1,C2>::use_c1, polynomial<C2> >::type
operator-(polynomial<C1> const &p, 
          polynomial<C2> const &q)
{
  return -q += p;
}

Suppose use_c1 is true. Then in the second version of operator -, disable_if_c<…>::type is undefined, while in the first version, enable_if_c<…>::type is its second template argument, i.e. polynomial<C1>. By SFINAE the second version is not considered by the compiler when it looks for an appropriate candidate for operator-= (because the return type does not make sense). If use_c1 is false, then the first version is invalid and in the second, disable_if_c<…>::type is its second template argument, i.e. polynomial<C2>.

Handling Different Variables

We are now ready to consider how to deal with polynomials in different variables. Before proceeding, note that our representation of a polynomial implies an ordering of the variables. The polynomial p= x + y + z, for example, is a polynomial in z with coefficients that are polynomials in x and y. These coefficients, in turn, are polynomials in y with coefficients that are polynomials in x. Of course, we could have used any other order. (For example, we could represent p as a polynomial in x with coefficients in y and z.) To the extent possible, our code should not make assumptions about this order. One assumption we will make, however, is that if two polynomials p and q appear in an arithmetic expression, then the ordering of the variables in p and q agree. For example, suppose p has variables z, x, and w (in this order) and q has variables z, y and x. The ordering of the variables in q must have z before x. (The variable y, however, could be first, second or third.)

Let's look at the implementation of the member function

template <typename C>
template <typename C2>
inline polynomial<C> &
polynomial<C>::operator -= 
        (polynomial<C2> const & q) {
  //. . .
  return *this;
}

If the variables of q are a subset of the variables of p, then computing p-=q is straightforward. We can relax this requirement somewhat: if q is not constant with respect to some variable, then that variable must be a variable of p. If, as in the last paragraph, p has variables z, x and w and q has variables z, y and x, then q must be constant in y (i.e. no term of q contains a non-zero power of y).

Now, there are two ways to handle the case where p and q have different variables. In the first case, we assume that what must be true is true:

template <typename C>
template <typename C2>
inline polynomial<C> &
polynomial<C>::operator -= (polynomial<C2> const & q) {
  //Compute p -= q
  if (this->first_variable() == 
      q.first_variable()) {
    //subtract two polys in same (first) 
    //variable:
    // . . .
  }
  else if (q.is_constant_in_first_variable())
    *this -= q[0];
  else {
    (*this)[0] -= q;
  }
  return *this;
}

Some explanation is in order. Let's suppose that the first variable of q is z. The method first_variable, returns a name, so q.first_variable() returns 'z'. The expression q[n] is the coefficient of the nth power of z (because z is the first variable of q), so q[0] is the constant part (with respect to z) of q. If q is constant in z, then q and q[0] are the same polynomial, so *this -= q[0] makes sense in the middle branch.

The last branch is more subtle. Since q is not constant in its first variable z, the variable z must be a variable of p, for otherwise this subtraction must fail. But p and q do not have the same first variable, so the first variable of p is not a variable of q. (It is here that we are using the assumption that the order of the variables of p and q are compatible, as described above.) In short, the subtraction will work in this case only if the variables of q are variables of p[0].

What happens if our assumptions do not hold? That is, what happens if q is not constant in some variable that is not a variable of p? The key is to examine the call in the last branch. If p and q have three variables (e.g. have type poly3_type), then (*this)[0] -= q invokes operator -= with arguments of types poly2_type and poly3_type, which in turn will invoke operator -= with arguments of types poly1_type and poly3_type. In this last invocation, (*this)[0] has type double. In order for this to compile, we need to define something like the following. (The actual signature is a little more complicated since the numerical coefficients may have type other than double.)

template <typename C> inline
double & operator -= (double & x,
                      const polynomial<C> & q)
{
  if ( ! q.is_constant())
    throw std::domain_error(
        "Different variables in subtraction");
  x -= q.leading_constant();
  return x;
}

Here is where the error is detected. The method leading_constant returns the numerical coefficient of the first term (i.e. the "leading" term). When q is constant, this method returns that constant.

Code Explosion

There is a serious problem with this approach. Even though our implementation of operator -= may seem compact, in fact it forces the instantiation of several versions of this operator. For example, if p and q both have type poly3_type (i.e. polynomials in 3 variables), then fifteen versions of operator -= are instantiated. For example, the statement

(*this)[0] -= q;

means that operator -= is instantiated for arguments of types poly2_type and poly3_type. We saw above that the statement

*this -= q[0];

invokes operator -= with arguments of types poly3_type and poly2_type. In short, we have every pair of arguments of polynomial type with 0 to 3 variables (except 0 and 0, i.e. double operator-=(double,double).) All of this for just one of the basic arithmetic operators to handle cases that may never arise!

There is a remedy for this combinatorial explosion of code. We are requiring that the result of p -= q be a polynomial with the same variables as p. This is possible only if q can be expressed as a polynomial in the same variables as p. In C++ terms this means that we can construct a polynomial that equals q mathematically but has the same type and variables as p:

template <typename C>
template <typename C2>
inline polynomial<C> &
polynomial<C>::operator -= (polynomial<C2>
   const & q)
{
  if (first_variable() == q.first_variable())
  {
    //subtract two poly in the same (first)
    //variable: . . .
  }
  else if ( ! q.is_zero()) {
    try {
      polynomial<C> qq(build_variable_list(),
         q);
      //Compute p -= qq . . .
    }
    catch (construction_error const &) {
      throw std::domain_error (
      "Different variables in subtraction");
    }
  }
  return *this;
}

The key element here is in the call to the constructor inside the try block. The method build_variable_list() returns the variables of *this. The constructor creates a polynomial with these variables and then attempts to copy the terms of q into this polynomial. If q has terms involving nonzero powers of some other variable, the attempt fails and a construction_error exception is thrown. With this approach we do not get the cascading calls to distinct versions of operator -= (assuming that the user does not try to subtract polynomials of different types.)

Conclusion

This polynomial class (and two other versions) are available online at http://www.fordham.edu/mathematics/hastings/polynomialclasses/.

The class described in this article is called fixed_poly<C> to emphasize that the number of variables is fixed at compile time. It is defined in FixedPoly.hpp and is available at this web address. Also at this address are several other necessary include files and one source file.

In the next installment, I examine how we can modify the approach described here to allow the number of variables to vary at run time.

References

[Alexandrescu-] Alexandrescu, Andrei and Herb Sutter C++ Coding Standards, item 65.

[Alexandrescu] Alexandrescu, Andrei Modern C++ Design, pp29ff.

[var_list] For details see http://www.fordham.edu/mathematics/hastings/polynomialclasses/var_list.htm

[enable_if] See http://www.boost.org/libs/utility/enable_if.html



[1] This default behaviour is easy to change.

Overload Journal #69 - Oct 2005 + Programming Topics