User-Defined Formatting in std::format – Part 3


By Spencer Collyer

Overload, 32(182):4-5, August 2024


We’ve seen formatting for simple classes and more complicated types. Spencer Collyer finishes his series by showing us how to apply specific formatting to existing classes.

In the previous articles in this series [Collyer24a], [Collyer24b] I showed how to write classes to format user-defined classes and container classes using the std::format library.

In this article I will show you how to create format wrappers, special purpose classes that allow you to apply specific formatting to objects of existing classes.

A note on the code listings: The code listings in this article have lines labelled with comments like // 1. Where these lines are referred to in the text of this article it will be as ‘line 1’ for instance, rather than ‘the line labelled // 1’.

Format wrappers

I’d now like to introduce a type of class which I call a ‘format wrapper’. A format wrapper is a very simple class which wraps a value of another type. It exists purely so that we can define a formatter for it. The idea is that the formatter will then output the wrapped value using a specific set of formatting rules. Hopefully this will become clearer when we discuss the Quoted format wrapper later.

A format wrapper normally consists of just a constructor taking an object of the wrapped type, and a public member variable holding a copy of, or a reference to, that value. It is intended to be used in the argument list of one of std::format’s formatting functions, purely as a way to select the correct formatter.

Quoted value

In C++14, a new I/O manipulator called quoted was added [CppRef]. When used as an output manipulator, it outputs the passed string as a quoted value. The value output has delimiter characters (the default is ") at the start and end, and any occurrences of the delimiter or the escape character (the default is \) have an extra escape character output before them.

For example, the following shows input strings on the left and what they would be output as on the right:

abcdef → "abcdef"

abc"def → "abc\"def"

ab\cd"ef → "ab\\cd\"ef"
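The mapping above can be reproduced with the stream manipulator itself. The following is a minimal sketch; quote_with_iostream is a hypothetical helper name introduced here for illustration, not something from the standard library.

```cpp
#include <cassert>
#include <iomanip>
#include <sstream>
#include <string>

// Hypothetical helper (not part of the standard library): streams `text`
// through std::quoted and returns whatever was written.
std::string quote_with_iostream(const std::string& text)
{
  std::ostringstream os;
  // Default delimiter is '"' and default escape character is '\\'.
  os << std::quoted(text);
  return os.str();
}
```

Calling quote_with_iostream with the three inputs shown on the left should give the strings shown on the right.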

If we want the same ability using std::format, we can create a format wrapper to do the work for us. An example of such a format wrapper and its associated formatter struct is Quoted, which is given in Listing 1, with sample output in Listing 2.

#include <format>
#include <iostream>
#include <string>
#include <string_view>
using namespace std;
struct Quoted   // 1
{
  Quoted(string_view str) // 2
  : m_sv(str)
  {}
  string_view m_sv;  // 3
};
template<>
struct std::formatter<Quoted>
{
  constexpr auto parse(format_parse_context& 
    parse_ctx)
  {
    auto iter = parse_ctx.begin();
    auto get_char = [&]() { return iter 
      != parse_ctx.end() ? *iter : 0; };
    char c = get_char();
    if (c == 0 || c == '}')
    {
      return iter;
    }
    m_quote = c;    // 4
    ++iter;
    if ((c = get_char()) != 0 && c != '}')  // 5
    {
      m_esc = c;
      ++iter;
    }
    if ((c = get_char()) != 0 && c != '}')  // 6
    {
      throw format_error(
        "Invalid Quoted format specification");
    }
    return iter;
  }
  auto format(const Quoted& p, 
    format_context& format_ctx) const
  {
    string_view::size_type pos = 0;
    string_view::size_type end = p.m_sv.length();
    auto out = format_ctx.out();    // 7
    *out++ = m_quote;   // 8
    while (pos < end)   // 9
    {
      auto c = p.m_sv[pos++];
      if (c == m_quote || c == m_esc) // 10
      {
        *out++ = m_esc;
      }
      *out++ = c; // 11
    }
    *out++ = m_quote;   // 12
    return out; // 13
  }
private:
  char m_quote = '"';
  char m_esc = '\\';
};
int main()
{
  cout << format("{}\n",
    Quoted(R"(With "double" quotes)"));
  cout << format("{:'}\n",
    Quoted("With 'single' quotes"));
  cout << format("{:'~}\n",
    Quoted("With 'single' quotes, "
    "different escape character"));
  cout << format("{:\"}\n",
    Quoted(R"(Mixed "double" and 'single' )"
    R"(quotes)"));
  cout << format("{}\n", 
    Quoted(R"(Escaped escape character '\')"));
  cout << format("{:\"~}\n",
    Quoted("Escaped escape character '~'"));
}
Listing 1
"With \"double\" quotes"
'With \'single\' quotes'
'With ~'single~' quotes, different escape character'
"Mixed \"double\" and 'single' quotes"
"Escaped escape character '\\'"
"Escaped escape character '~~'"
Listing 2

The Quoted format wrapper

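Line 1 starts the format wrapper. Because it is all public, we define it as a struct rather than a class. Line 2 defines the constructor, which just copies the given str to m_sv, defined in line 3. Because m_sv is a string_view, only the view is copied, so the characters it refers to must remain valid for as long as the Quoted object is in use.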

The parse function

The format specification for Quoted has the following form:

  [ quote [ escape ] ]

The quote element is a single character that is the quote character to use as delimiters on the string. If not given it defaults to ".

The escape element is a single character that gives the escape character to use on the string. If not given it defaults to \. Note that you can only give an escape if you have already given a quote.

The first part of the parse function should be familiar from examples in my previous articles.

Line 4 picks up the first character and assigns it as the quote character.

Line 5 checks if we have reached the end of the format-spec, and if not it picks up the next character and assigns it as the escape character.

Line 6 is our normal check for reaching the end of the format-spec. If any characters remain after the optional quote and escape characters, a format_error is thrown.

The format function

Before describing the format function in detail, please be aware that it is not optimised for speed, but has been kept simple as we are just using it as an example (see footnote 1).

Line 7 picks up the current value of the output iterator from format_ctx, and then line 8 writes the delimiter preceding the string to it.

Starting at line 9, we iterate over all the characters in the string. For each character, line 10 checks if it is a quote or escape character, and if so outputs an escape character. Then line 11 outputs the actual character. As noted above, the speed of this loop could be improved although at the cost of making it more complicated.

Finally, line 12 outputs the delimiter after the string, and then line 13 returns the output iterator as the value of the function, as required.

Why use format wrappers?

After reading the description above you will hopefully understand the purpose of format wrappers. But you may ask, why would you want to use them? After all, you could just create a function that takes an object of the wrapped type and returns a value which can then be written to the output, without having to create a format wrapper class and the associated formatter struct.

There are two main reasons for using a format wrapper rather than a function, as follows.

  1. If you use a function to do the formatting of the value, you have to return a value which is then output. Using a format wrapper this interim object is not required – the format function for the format wrapper can write the value directly to the output.
  2. If your format wrapper’s formatter allows for different formatting to be applied (as the one for Quoted does), that is simply done using a format-spec in the normal way. If you were to use a function to do the formatting instead, you could of course pass parameters to it indicating any changes required to the formatting, but there are a couple of disadvantages with doing that:
    • The formatting parameters for the value are separate from the format string, meaning anyone reading the code wouldn’t see all the formatting instructions at a glance. You may think this is just an aesthetic problem and you aren’t concerned about it, but to me it looks tidier to have all the formatting instructions in one place.
    • If or when you decide you need to internationalize your program, you may need to handle different formatting for the value, depending upon the expectations of various countries / languages as indicated by the locale. If the formatting instructions are embedded in the format string as a format-spec it is easy for translators to update them as required. However, if you are passing them as parameters to a function, you would need to add code to select the correct values to pass in based on locale.

References

[Collyer24a] Spencer Collyer, ‘User-Defined Formatting in std::format’, Overload 180, April 2024, available at https://accu.org/journals/overload/32/180/collyer/

[Collyer24b] Spencer Collyer, ‘User-Defined Formatting in std::format – Part 2’, Overload 181, June 2024, available at https://accu.org/journals/overload/32/181/collyer/

[CppRef] CPP Reference: std::quoted. Available at https://en.cppreference.com/w/cpp/io/manip/quoted

Footnote

  1. An optimised version would not process the string one character at a time. It would split the string up into substrings delimited by occurrences of the quote and escape characters, and output those substrings using std::format_to. That would generally be faster than checking and outputting each character individually.

Spencer Collyer

Spencer has been programming for more years than he cares to remember, mostly in the financial sector, although in his younger years he worked on projects as diverse as monitoring water treatment works on the one hand, and television programme scheduling on the other.





