A multivariate polynomial in several variables, for example, 7 +
4x^{2}y^{3} - 5yz^{6}, may be thought of as
a polynomial in z with coefficients that are themselves polynomials
in x and y. These coefficients, in turn, are polynomials in y with
coefficients that are polynomials in x. How can we implement this
recursive definition in C++? In this installment and the next we
will explore two distinct solutions. In the first solution the
number of variables is built into the definition of the class. In
the second, the number of variables may vary at runtime. Before
proceeding, note that similar issues arise if we define a
multi-dimensional array as an array of arrays (which is useful if
the number of non-zero entries is relatively small).

If we know the number of variables at compile time, then the overall structure of a possible solution is straightforward. Suppose we define a template class

template <typename C> class polynomial {...};

where C is the type of the coefficients. This class is a
container of pairs of the form [power, coefficient]. For the
polynomial above, we have two pairs, [0, 7 +
4x^{2}y^{3}] and [6, -5y]. A polynomial in three
variables with numerical coefficients of type `double` has type

typedef class polynomial<polynomial<polynomial <double> > > poly3_type;

A useful way to define `poly3_type` is:

typedef class polynomial<double> poly1_type; typedef class polynomial<poly1_type> poly2_type; typedef class polynomial<poly2_type> poly3_type;

We can write

variable_list<char> vars3("xyz"); poly3_type p(vars3), q(vars3); p.read("5-3x^2y^4+x^3z^3"); std::cin >> q; std::cout << "p has " << p.number_terms() << "terms.\n"; std::cout << "p+q=" << p + q << '\n';

Note that the caret (^) means exponentiation, so y^3 means
y*y*y. The role of `class variable_list`
will be explained below.

This approach works well, but there are a few problems to overcome. First, suppose we want our polynomial class to have a method that returns the number of terms. In our example polynomial there are three terms. This is easy to compute - just sum the number of terms in each coefficient:

template <typename C> int polynomial<C>::number_terms() const { int count = 0; for (const_iterator it(begin()); it != end(); ++it) count += it->coefficient().number_terms(); return count; }

Here, the method `coefficient()`
returns a `const` reference to the second
element of a pair in our container. The problem is that this code
will not compile. Note that the call to `number_terms()` inside the `for` loop is not a true recursive call. For example,
if we call this method for a polynomial of type `poly3_type`, then `it
>coefficient()` has type `poly2_type`
and so we are calling `poly2_type::number_terms`, which in turn calls
`poly1_type::number_terms`. It is this
last call that causes the problem, since now `it
>coefficient()` has type `double`, and
type `double` has no method named `number_terms`.

There are at least three ways to deal with this problem. First, we could use partial specialization to handle the end case. Having defined

template <typename C> class polynomial;

we can now define, for example,

template <> class polynomial<double>;

But this requires specializing class polynomial for each possible numerical type, for example,

template <> class polynomial<std::complex<double> >;

A better way is to turn this approach on its head and define the partial specialization for the case where the coefficients are themselves polynomials:

template <typename C> class polynomial<polynomial<C> > { . . . };

With this approach, we have

template <typename C> class polynomial { //Coefficients are numeric public: int number_terms() const { int count = 0; for (const_iterator it(begin()); it != end(); ++it) if (it->coefficient() != 0) ++count; return count; } //other methods . . . }

and the partial specialization

template <typename C> class polynomial<polynomial<C> > { //Coefficients are polynomials public: int number_terms() const {int count = 0; for (const_iterator it(begin()); it != end(); ++it) count += it->coefficient().number_terms(); return count; } //other methods . . . }

The objection to this approach is that we must repeat all of the code in the definition of the class and in its partial specialization, even if the code is identical. A variant on this approach is to use a traits class [Alexandrescu-]. We define (for numeric coefficients)

template <typename C> class coefficient_traits { public: static int number_terms(C const &c) { return c != 0 ? 1 : 0; } //other methods . . . };

As above, we need to specialize this for the case of a multivariate polynomial:

template <typename C> class coefficient_traits<polynomial<C> > { public: typedef polynomial<C> coef_type; static int number_terms(coef_type const &c) { int count = 0; for (coef_type::const_iterator it(c.begin()); it != c.end(); ++it) count += it->coefficient().number_terms(); return count; } //other methods . . . };

Then we can define `number_terms()` in
class polynomial by

int number_terms() const { int count = 0; for (const_iterator it(begin()); it != end(); ++it) count += coefficient_traits<C>:: number_terms(it->coefficient()); return count; }

Of course, the traits class can be used to record other
properties. Suppose we have a template class called `numeric_traits` that includes all the methods
required for `coefficient_traits` with
definitions that are appropriate for most numeric types. Then we
define `coefficient_traits` by

template <typename T> class coefficient_traits : public numeric_traits<T> { };//Default case

Of course, we must provide a partial specialization for coefficients that are polynomials:

template <typename C> class coefficient_traits<polynomial<C> > {...};

We can also specialize as needed for numeric coefficients. For
example, to output the polynomial x - 3y, we need to know that the
coefficient of y is negative. (Otherwise, we would output x + -3y.)
Now for a coefficient c of type double, we can ask if c<0, but
not if c has type complex. Therefore, we include in `coefficient_traits` a method

static bool is_negative(C const &c);

For C of type `std::complex<T>`, we
can override this method in an appropriate way.

One possibility is:

template <typename T> class coefficient_traits< std::complex<T> > : public numeric_traits<std::complex<T> > { public: static bool is_negative(coef_type const & c) { return c.imag() == 0 && c.real() < 0; } //other overrides };

Let's get back to our original problem, namely how to
differentiate between coefficients that are polynomials and those
that are numeric. Our third approach relies on the fact that in a
template class, a method that is not invoked must not be
instantiated. With this approach the method `number_terms` calls one of two methods depending
on the coefficient type. For this to work, the choice must be made
at compile time. To this end we use a technique described in
Alexandrescu's book [Alexandrescu].
First an auxiliary class will be used to distinguish the two
cases:

template < bool b > struct Bool2Type { };

Now, define in class `polynomial<C>`

int number_terms() const { return number_terms( Bool2Type<(nbr_vars==1)>()); } int number_terms(Bool2Type<false>) const { int count = 0; for (const_iterator it(begin()); it != end(); ++it) count += it->coefficient().number_terms(); return count; } int number_terms(Bool2Type<true>) const { int count = 0; for (const_iterator it(begin()); it != end(); ++it) if (it->coefficient() != 0) ++count; return count; }

Depending on the type C, exactly one of the latter two methods
will be invoked; the other will not be instantiated, so there will
be no compile-time error. The compile-time constant `nbr_vars` has other uses as we will see below. It is
defined by

namespace detail { //Count the number of variables in a poly template <typename C>//Base case struct var_count { static const int nbr_vars = 0; }; template <typename C> struct var_count<polynomial<C> > { static const int nbr_vars = 1 + var_count<C>::nbr_vars; }; }

and then in class `polynomial<C>`:

static const int nbr_vars = detail::var_count<polynomial<C> >::nbr_vars;

In class `polynomial` most of the
methods do not depend on the coefficient type; for the few that do,
I have used both the traits class and method pairs using `Bool2Type`.

Before proceeding with how class `polynomial` is constructed, we need to consider
variable names and polynomials with different numbers of variables.
The representation of polynomials we are examining here implies an
ordering of the variables. For example, we may think of p as a
polynomial in y with coefficients that are polynomials in x. Of
course, we could just as well think of p as a polynomial in x with
coefficients in y. What is essential, of course, is consistency. To
compute p + q , for example, we want p and q to have variables in
the same order. Furthermore, to simplify input it is useful to know
in advance what the variable names are.

Let's put the responsibility for maintaining a collection of variable names with order in a separate class

template <typename NameType> class variable_list;

This class is just an ordered container of names of type
`NameType` (e.g. `char`)
[var_list]. Its primary constructor is

variable_list(char const * vars);

where the character string `vars` might
be `"xyz"`. The constructor will transform
`vars` into an ordered list. (By default,
this is reverse alphabetical order.) So, for example,

variable_list<char> vars1("xyz");

and

variable_list<char> vars1("zyx");

are equivalent.^{[1]}

Now, to construct polynomials we have

typedef variable_list<char> var_list_type; var_list_type vars3("xyz"), vars2("xy"); poly2_type p(vars2); poly3_type q(vars3);

Suppose we want to compute q -= p. (We use subtraction as an example since p - q <> q - p , and so it is slightly more complicated than addition.) Mathematically, q - p makes sense as a polynomial whatever the variables of p and q. It will have three variables if the variables of p are also variables of q. For example, if p = x + z and q = xyz, then q - p is x + z - xyz. On the other hand, if p = w + x, then we have a problem with q -= p, since q - p now has four variables while the number of variables of q is fixed at three. (We'll see how to solve this problem in part 2.)

In any event, I would like to be tolerant and allow p and q to have different variables, or at least a different number of variables. This implies that we need to declare the subtraction operator by

template <typename C1, typename C2> ????? operator -(polynomial<C1> const & p, polynomial<C2> const & q);

The fun begins with the return type. If C1 and C2 are not the
same, the return type should be the type of the argument that has
more variables. For example, the return type of p - q (of types
`poly2_type` and `poly3_type`, as above) should be `poly3_type`. Here are two ways to deal with this
problem. Both approaches need to compare the number of variables of
types `polynomial<C1>` and
`polynomial<C2>`, so let's
encapsulate this information in a class:

namespace detail { template <typename C1, typename C2, bool b> struct pick_poly //case b is false { typedef polynomial<C2> type; }; template <typename C1, typename C2> struct pick_poly<C1, C2, true> { typedef polynomial <C1> type; }; template <typename C1, typename C2> struct compare_coefs { static const bool use_c1 = (polynomial<C1>::nbr_vars >= polynomial <C2>::nbr_vars); typedef typename pick_poly<C1,C2,use_c1>::type return_type; }; }

Here the constant `use_c1` is true if
`polynomial<C1>` has at least as
many variables as `polynomial<C2>`
and `compare_coefs<C1,C2>::return_type`
is either `polynomial<C1>` or
`polynomial<C2>`, depending on which
has more variables. Hence this type should be the return type of
`operator -`. Our first approach to
defining subtraction uses `Bool2Type`
defined above:

namespace detail { template <typename C1, typename C2> polynomial<C1> do_subtract( polynomial<C1> p, polynomial<C2> const & q, Bool2Type<true>) { return p -= q; } template <typename C1, typename C2> polynomial<C2> do_subtract( polynomial<C1> const & p, polynomial<C2> const & q, Bool2Type<false>) { //create a polynomial of type polynomial<C2>: return -q += p; } } template <typename C1, typename C2> typename detail::compare_coefs<C1,C2> ::return_type operator-(polynomial<C1> const &p, polynomial<C2> const &q) { return detail::do_subtract(p, q, Bool2Type<detail::compare_coefs<C1,C2> ::use_c1>()); }

The third argument to * do_subtract* ensures that we return a
polynomial of the correct type. Note that if p and q always have
the same number of variables, then the second version of
do_subtract will not be instantiated. Also note that the types of
the first argument in the two versions of

*are slightly different: the first takes a polynomial by value, the second by*

`do_subtract``const`reference. To see why, examine the bodies of the two versions.

The second approach uses `boost::enable_if`, which is based on the SFINAE
[enable_if] principle. Two versions of
operator - are defined, but only one version is valid:

template <typename C1, typename C2> typename boost::enable_if_c<detail::compare_coefs <C1,C2>::use_c1, polynomial<C1> >::type operator -(polynomial<C1> p, polynomial<C2> const & q) { return p -= q; } template <typename C1, typename C2> typename boost::disable_if_c<detail::compare_coefs <C1,C2>::use_c1, polynomial<C2> >::type operator-(polynomial<C1> const &p, polynomial<C2> const &q) { return -q += p; }

Suppose `use_c1` is true. Then in the
second version of `operator -`,
`disable_if_c<…>::type` is
undefined, while in the first version, `enable_if_c<…>::type` is its second
template argument, i.e. `polynomial<C1>`. By SFINAE the second version
is not considered by the compiler when it looks for an appropriate
candidate for `operator-=` (because the
return type does not make sense). If `use_c1` is false, then the first version is invalid
and in the second, `disable_if_c<…>::type` is its second
template argument, i.e. `polynomial<C2>`.

We are now ready to consider how to deal with polynomials in different variables. Before proceeding, note that our representation of a polynomial implies an ordering of the variables. The polynomial p= x + y + z, for example, is a polynomial in z with coefficients that are polynomials in x and y. These coefficients, in turn, are polynomials in y with coefficients that are polynomials in x. Of course, we could have used any other order. (For example, we could represent p as a polynomial in x with coefficients in y and z.) To the extent possible, our code should not make assumptions about this order. One assumption we will make, however, is that if two polynomials p and q appear in an arithmetic expression, then the ordering of the variables in p and q agree. For example, suppose p has variables z, x, and w (in this order) and q has variables z, y and x. The ordering of the variables in q must have z before x. (The variable y, however, could be first, second or third.)

Let's look at the implementation of the member function

template <typename C> template <typename C2> inline polynomial<C> & polynomial<C>::operator -= (polynomial<C2> const & q) { //. . . return *this; }

If the variables of q are a subset of the variables of p, then computing p-=q is straightforward. We can relax this requirement somewhat: if q is not constant with respect to some variable, then that variable must be a variable of p. If, as in the last paragraph, p has variables z, x and w and q has variables z, y and x, then q must be constant in y (i.e. no term of q contains a non-zero power of y).

Now, there are two ways to handle the case where p and q have different variables. In the first case, we assume that what must be true is true:

template <typename C> template <typename C2> inline polynomial<C> & polynomial<C>::operator -= (polynomial<C2> const & q) { //Compute p -= q if (this->first_variable() == q.first_variable()) { //subtract two polys in same (first) //variable: // . . . } else if (q.is_constant_in_first_variable()) *this -= q[0]; else { (*this)[0] -= q; } return *this; }

Some explanation is in order. Let's suppose that the first
variable of q is z. The method `first_variable`, returns a name, so `q.first_variable()` returns 'z'. The expression
`q[n]` is the coefficient of the
n^{th} power of z (because z is the first variable of q),
so q[0] is the constant part (with respect to z) of q. If q is
constant in z, then q and q[0] are the same polynomial, so
`*this -= q[0]` makes sense in the middle
branch.

The last branch is more subtle. Since q is not constant in its first variable z, the variable z must be a variable of p, for otherwise this subtraction must fail. But p and q do not have the same first variable, so the first variable of p is not a variable of q. (It is here that we are using the assumption that the order of the variables of p and q are compatible, as described above.) In short, the subtraction will work in this case only if the variables of q are variables of p[0].

What happens if our assumptions do not hold? That is, what
happens if q is not constant in some variable that is not a
variable of p? The key is to examine the call in the last branch.
If p and q have three variables (e.g. have type `poly3_type`), then `(*this)[0] -=
q` invokes `operator -=` with
arguments of types `poly2_type` and `poly3_type`, which in turn will invoke `operator -=` with arguments of types `poly1_type` and `poly3_type`. In
this last invocation, `(*this)[0]` has type
double. In order for this to compile, we need to define something
like the following. (The actual signature is a little more
complicated since the numerical coefficients may have type other
than `double`.)

template <typename C> inline double & operator -= (double & x, const polynomial<C> & q) { if ( ! q.is_constant()) throw std::domain_error( "Different variables in subtraction"); x -= q.leading_constant(); return x; }

Here is where the error is detected. The method `leading_constant` returns the numerical
coefficient of the first term (i.e. the "leading" term). When q is
constant, this method returns that constant.

There is a serious problem with this approach. Even though our
implementation of `operator -=` may seem
compact, in fact it forces the instantiation of several versions of
this operator. For example, if p and q both have type `poly3_type` (i.e. polynomials in 3 variables), then
fifteen versions of `operator -=` are
instantiated. For example, the statement

(*this)[0] -= q;

means that `operator -=` is
instantiated for arguments of types `poly2_type` and `poly3_type`. We saw
above that the statement

*this -= q[0];

invokes `operator -=` with arguments of
types `poly3_type` and `poly2_type`. In short, we have every pair of arguments
of polynomial type with 0 to 3 variables (except 0 and 0, i.e.
`double operator-=(double,double)`.) All
of this for just one of the basic arithmetic operators to handle
cases that may never arise!

There is a remedy for this combinatorial explosion of code. We are requiring that the result of p -= q be a polynomial with the same variables as p. This is possible only if q can be expressed as a polynomial in the same variables as p. In C++ terms this means that we can construct a polynomial that equals q mathematically but has the same type and variables as p:

template <typename C> template <typename C2> inline polynomial<C> & polynomial<C>::operator -= (polynomial<C2> const & q) { if (first_variable() == q.first_variable()) { //subtract two poly in the same (first) //variable: . . . } else if ( ! q.is_zero()) { try { polynomial<C> qq(build_variable_list(), q); //Compute p -= qq . . . } catch (construction_error const &) { throw std::domain_error ( "Different variables in subtraction"); } } return *this; }

The key element here is in the call to the constructor inside
the try block. The method `build_variable_list()` returns the variables of
`*this`. The constructor creates a
polynomial with these variables and then attempts to copy the terms
of q into this polynomial. If q has terms involving nonzero powers
of some other variable, the attempt fails and a `construction_error exception` is thrown. With
this approach we do not get the cascading calls to distinct
versions of `operator -=` (assuming that
the user does not try to subtract polynomials of different
types.)

This polynomial class (and two other versions) are available online at http://www.fordham.edu/mathematics/hastings/polynomialclasses/.

The class described in this article is called `fixed_poly<C>` to emphasize that the number
of variables is fixed at compile time. It is defined in `FixedPoly.hpp` and is available at this web address.
Also at this address are several other necessary include files and
one source file.

In the next installment, I examine how we can modify the approach described here to allow the number of variables to vary at run time.

^{[1]} This default behaviour is easy to
change.

## Overload Journal #69 - Oct 2005 + Programming Topics

Browse in : |
All
> Journals
> Overload
> 69
(6)
All > Topics > Programming (862) Any of these categories - All of these categories |