Suppose you want to write a filter which replaces each tab character with one or more space characters in such a way that the document appears unchanged when displayed. The basic algorithm is as follows: You examine characters one at a time, forwarding them as-is and keeping track of the current column number. When you encounter a tab character, you replace it with a sequence of space characters whose length depends on the current column count. When you encounter a newline character, you forward it and reset the column count.
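Before turning to the Filter versions, the algorithm can be sketched as an ordinary function over a whole string. This is only an illustration: the name expand_tabs is hypothetical and is not part of Boost.Iostreams or of the example header.

```cpp
#include <cassert>
#include <string>

// Hypothetical helper, not part of Boost.Iostreams: expands tabs in a whole
// string using the column-tracking algorithm described above.
std::string expand_tabs(const std::string& in, int tab_size = 8)
{
    assert(tab_size > 0);
    std::string out;
    int col_no = 0;  // current column, counted from zero
    for (char c : in) {
        if (c == '\t') {
            // Pad with spaces up to the next tab stop.
            int spaces = tab_size - (col_no % tab_size);
            out.append(static_cast<std::string::size_type>(spaces), ' ');
            col_no += spaces;
        } else {
            out.push_back(c);
            // A newline starts a new line, so the column count resets.
            col_no = (c == '\n') ? 0 : col_no + 1;
        }
    }
    return out;
}
```

With a tab size of 4, the input "a\tb" becomes "a" followed by three spaces and "b", since the tab starts in column 1 and the next tab stop is column 4.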
In the next three sections, I'll express this algorithm as a stdio_filter, an InputFilter and an OutputFilter. The source code can be found in the header <libs/iostreams/example/tab_expanding_filter.hpp>. These examples were inspired by James Kanze's ExpandTabsInserter.hh (see [Kanze]).
tab_expanding_stdio_filter
You can express a tab-expanding Filter as a stdio_filter as follows:
#include <cassert>   // assert
#include <cstdio>    // EOF
#include <iostream>  // cin, cout
#include <boost/iostreams/filter/stdio.hpp>

namespace boost { namespace iostreams { namespace example {

class tab_expanding_stdio_filter : public stdio_filter {
public:
    explicit tab_expanding_stdio_filter(int tab_size = 8)
        : tab_size_(tab_size), col_no_(0)
    {
        assert(tab_size > 0);
    }
private:
    void do_filter();
    void do_close();
    void put_char(int c);
    int  tab_size_;
    int  col_no_;
};

} } } // End namespace boost::iostreams::example
The helper function put_char is identical to line_wrapping_stdio_filter::put_char. It writes a character to std::cout and updates the column count:
void put_char(int c)
{
    std::cout.put(c);
    if (c == '\n') {
        col_no_ = 0;
    } else {
        ++col_no_;
    }
}
Using put_char you can implement do_filter as follows:
void do_filter()
{
    int c;
    while ((c = std::cin.get()) != EOF) {
        if (c == '\t') {
            int spaces = tab_size_ - (col_no_ % tab_size_);
            for (; spaces > 0; --spaces)
                put_char(' ');
        } else {
            put_char(c);
        }
    }
}
The while loop reads characters from std::cin one at a time and writes each to std::cout, unless it is a tab character, in which case it writes enough space characters to std::cout to reach the next tab stop.
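The number of spaces written for a tab depends only on the current column and the tab size. As a small worked illustration, the expression used in do_filter can be isolated in a free function; the name spaces_to_next_tab_stop is hypothetical and does not appear in the example header.

```cpp
#include <cassert>

// Hypothetical free function mirroring the expression used in do_filter.
// Returns how many spaces replace a tab that starts in column col_no.
int spaces_to_next_tab_stop(int col_no, int tab_size)
{
    assert(tab_size > 0 && col_no >= 0);
    // col_no % tab_size is the distance already travelled past the last
    // tab stop; the remainder of the interval must be filled with spaces.
    return tab_size - (col_no % tab_size);
}
```

With a tab size of 8, a tab in column 0 yields 8 spaces, a tab in column 3 yields 5, and a tab in column 7 yields 1. The result is always between 1 and tab_size, so a tab is never replaced by an empty sequence.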
As with line_wrapping_stdio_filter, the virtual function do_close resets the Filter's state:
void do_close() { col_no_ = 0; }
tab_expanding_input_filter
You can express a tab-expanding Filter as an InputFilter as follows:
#include <cassert>                          // assert
#include <boost/iostreams/char_traits.hpp>  // EOF, WOULD_BLOCK
#include <boost/iostreams/concepts.hpp>     // input_filter
#include <boost/iostreams/operations.hpp>   // get

namespace boost { namespace iostreams { namespace example {

class tab_expanding_input_filter : public input_filter {
public:
    explicit tab_expanding_input_filter(int tab_size = 8)
        : tab_size_(tab_size), col_no_(0), spaces_(0)
    {
        assert(tab_size > 0);
    }

    template<typename Source>
    int get(Source& src);

    template<typename Source>
    void close(Source&);
private:
    int get_char(int c);
    int tab_size_;
    int col_no_;
    int spaces_;
};

} } } // End namespace boost::iostreams::example
Let's look first at the helper function get_char:
int get_char(int c)
{
    if (c == '\n') {
        col_no_ = 0;
    } else {
        ++col_no_;
    }
    return c;
}
This function updates the column count based on the given character c and returns c. Using get_char you can implement get as follows:
template<typename Source>
int get(Source& src)
{
    if (spaces_ > 0) {
        --spaces_;
        return get_char(' ');
    }

    int c;
    if ((c = iostreams::get(src)) == EOF || c == WOULD_BLOCK)
        return c;

    if (c != '\t')
        return get_char(c);

    // Found a tab. Call this filter recursively.
    spaces_ = tab_size_ - (col_no_ % tab_size_);
    return this->get(src);
}
The implementation is similar to that of line_wrapping_input_filter::get. Since get can return only a single character at a time, whenever a tab character must be replaced by a sequence of space characters, only the first space character can be returned. The rest must be returned by subsequent invocations of get. The member variable spaces_ is used to store the number of such space characters.
The implementation begins by checking whether any space characters remain to be returned. If so, it decrements spaces_ and returns a space. Otherwise, a character is read from src. Ordinary characters, as well as the special values EOF and WOULD_BLOCK, are returned as-is. When a tab character is encountered, the number of spaces needed to reach the next tab stop is recorded in spaces_, and get calls itself to return the first of them; the rest are returned by future invocations of get.
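The way spaces_ carries state between calls can be seen in a simplified, standard-library-only model of the filter. This is a sketch under stated assumptions, not the Boost class: string_source, pull_tab_expander and read_all are hypothetical names, the source is a plain string, and -1 stands in for EOF (WOULD_BLOCK is not modelled).

```cpp
#include <cassert>
#include <cstddef>
#include <string>

// Hypothetical stand-in for a Source: hands out the characters of a string
// one at a time, then returns -1 to signal end of input.
struct string_source {
    std::string data;
    std::size_t pos;
    int get()
    {
        return pos < data.size() ? static_cast<unsigned char>(data[pos++]) : -1;
    }
};

// Simplified model of the pull-style filter; not the Boost class itself.
class pull_tab_expander {
public:
    explicit pull_tab_expander(int tab_size = 8)
        : tab_size_(tab_size), col_no_(0), spaces_(0)
    {
        assert(tab_size > 0);
    }

    int get(string_source& src)
    {
        if (spaces_ > 0) {  // spaces still owed from an earlier tab
            --spaces_;
            return get_char(' ');
        }
        int c = src.get();
        if (c == -1)
            return c;       // end of input passes through unchanged
        if (c != '\t')
            return get_char(c);
        // Tab: record the full run, then re-enter to return its first space.
        spaces_ = tab_size_ - (col_no_ % tab_size_);
        return get(src);
    }
private:
    int get_char(int c)
    {
        if (c == '\n') col_no_ = 0; else ++col_no_;
        return c;
    }
    int tab_size_;
    int col_no_;
    int spaces_;
};

// Drains the filter, collecting what successive calls to get return.
std::string read_all(pull_tab_expander& f, string_source& src)
{
    std::string out;
    for (int c; (c = f.get(src)) != -1; )
        out.push_back(static_cast<char>(c));
    return out;
}
```

With a tab size of 2 and the input "\tx", three successive calls to get return a space, a space and 'x'. The second space comes from the stored count in spaces_, not from the source.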
As usual, the function close resets the Filter's state:
template<typename Source>
void close(Source&)
{
    col_no_ = 0;
    spaces_ = 0;
}
tab_expanding_output_filter
You can express a tab-expanding Filter as an OutputFilter as follows:
#include <cassert>                         // assert
#include <boost/iostreams/concepts.hpp>    // output_filter
#include <boost/iostreams/operations.hpp>  // put

namespace boost { namespace iostreams { namespace example {

class tab_expanding_output_filter : public output_filter {
public:
    explicit tab_expanding_output_filter(int tab_size = 8)
        : tab_size_(tab_size), col_no_(0), spaces_(0)
    {
        assert(tab_size > 0);
    }

    template<typename Sink>
    bool put(Sink& dest, int c);

    template<typename Sink>
    void close(Sink&);
private:
    template<typename Sink>
    bool put_char(Sink& dest, int c);
    int tab_size_;
    int col_no_;
    int spaces_;
};

} } } // End namespace boost::iostreams::example
The implementation helper function put_char is the same as line_wrapping_output_filter::put_char: it attempts to write the given character to dest and, if the write succeeds, increments the column number, unless the character is a newline, in which case the column number is reset.
template<typename Sink>
bool put_char(Sink& dest, int c)
{
    if (!iostreams::put(dest, c))
        return false;
    if (c != '\n')
        ++col_no_;
    else
        col_no_ = 0;
    return true;
}
Using put_char you can implement put as follows:
template<typename Sink>
bool put(Sink& dest, int c)
{
    for (; spaces_ > 0; --spaces_)
        if (!put_char(dest, ' '))
            return false;

    if (c == '\t') {
        spaces_ = tab_size_ - (col_no_ % tab_size_) - 1;
        return this->put(dest, ' ');
    }

    return put_char(dest, c);
}
The implementation begins by attempting to write any space characters left over from previously encountered tabs. If successful, it examines the given character c. If c is not a tab character, it attempts to write it to dest. Otherwise, it calculates the number of spaces which must be inserted and calls itself recursively. Using recursion here saves us from having to decrement the member variable spaces_ at two different points in the code.
Note that after a tab character is encountered, put will return false until all the associated space characters have been written.
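The effect of a Sink that temporarily refuses characters can be sketched with a simplified, standard-library-only model. The names bounded_sink and push_tab_expander are hypothetical, and the sketch assumes that a caller responds to a false return by resubmitting the same character later.

```cpp
#include <cassert>
#include <cstddef>
#include <string>

// Hypothetical stand-in for a non-blocking Sink: accepts at most `budget`
// characters, then refuses further writes until the budget is raised.
struct bounded_sink {
    std::string data;
    std::size_t budget;
    bool put(int c)
    {
        if (budget == 0)
            return false;
        --budget;
        data.push_back(static_cast<char>(c));
        return true;
    }
};

// Simplified model of the push-style filter; not the Boost class itself.
class push_tab_expander {
public:
    explicit push_tab_expander(int tab_size = 8)
        : tab_size_(tab_size), col_no_(0), spaces_(0)
    {
        assert(tab_size > 0);
    }

    bool put(bounded_sink& dest, int c)
    {
        // First flush any spaces still owed from an earlier tab.
        for (; spaces_ > 0; --spaces_)
            if (!put_char(dest, ' '))
                return false;
        if (c == '\t') {
            // One space is written by the recursive call itself,
            // so only the others are recorded in spaces_.
            spaces_ = tab_size_ - (col_no_ % tab_size_) - 1;
            return put(dest, ' ');
        }
        return put_char(dest, c);
    }
private:
    bool put_char(bounded_sink& dest, int c)
    {
        if (!dest.put(c))
            return false;
        if (c != '\n') ++col_no_; else col_no_ = 0;
        return true;
    }
    int tab_size_;
    int col_no_;
    int spaces_;
};
```

With a tab size of 4 and a sink that accepts only two characters, the first put(sink, '\t') writes two spaces and returns false. Once the sink accepts more, resubmitting the tab writes the remaining two spaces and returns true, so the output still ends exactly at the tab stop.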
As usual, the function close resets the Filter's state:
template<typename Sink>
void close(Sink&)
{
    col_no_ = 0;
    spaces_ = 0;
}
Revised 20 May, 2004
© Copyright Jonathan Turkanis, 2004
Use, modification, and distribution are subject to the Boost Software License, Version 1.0. (See accompanying file LICENSE_1_0.txt or copy at http://www.boost.org/LICENSE_1_0.txt)