As the Hexagon Grid becomes popular for machine vision, analysis-synthesis filter banks tailored for that Grid are needed. In this paper we exhibit two perfect reconstruction (P.R.) decompositions for signals on a hexagon lattice. Each example filter bank has better properties than other banks proposed for the hexagon grid, and, it would seem, better properties than any possible comparable filter bank for the square grid.
One example is an orthogonal coding transform with one low-pass and three high-pass filters similar to that used in the Simoncelli-Adelson quadtree pyramid for the hexagon grid. Unlike the Simoncelli-Adelson filter bank, we achieve (exact) perfect reconstruction with compact (and integer-valued) filters. The other example is a 3-ary ``blurring'' filter bank which has a perfect inverse (``sharpening'') filter bank. In prior art, one is usually satisfied with almost-perfect reconstruction for such systems. Since mathematical expressions are somewhat difficult to depict in HTML, this version, while self-contained, omits some of the detailed mathematical descriptions in the original LaTeX version. |
Images are usually sampled on a Square Lattice but a Hexagon Lattice has several important advantages which amply justify its use for many applications (e.g. robot vision, deep-space imaging, biomedical image processing).
In a companion paper (``Decomposition of orthogonal filter banks'') we describe a general method for representing perfect reconstruction (P.R.) multi-dimensional filter banks. In this paper we give two example P.R. decompositions for signals on a hexagon lattice.
To make the paper self-contained, we offer brief practical definitions of filterbank terminology.
A complete filterbank decomposition replaces the N pixels of an image with N filterbank outputs (transform coefficients). Each output is the scalar product of the image and a filter. The locus of non-zero taps in the filter is called its support kernel. The support kernels in our example filters are each of size 16 or 17, about the minimum for a ``high-quality'' filter on a 2-dimensional signal.
A filterbank is compact when all its support kernels have finite size.
A complete compact filterbank offers perfect reconstruction when an inverse complete compact filterbank exists which allows the original signal to be (perfectly) reconstructed from the filterbank outputs. The two filterbanks are called analysis (decomposition) and synthesis (reconstructing inverse).
An analysis filterbank is orthogonal when its (synthesizing) inverse is exact and is simply the transpose of the analysis filterbank. Both our example filterbanks provide perfect reconstruction, but only the first is orthogonal.
In a K-ary filterbank the N filters are derived from K basic filters by shifting. Our two examples are 4-ary and 3-ary filterbanks respectively. Since the N image pixels produce only N / K outputs for each basic filter, K is called the decimation factor.
A K-ary filterbank may have fewer than K basic filters when rotations are considered. For example, the 2-D wavelet decomposition in a method like JPEG-2000 is a 4-ary filterbank, but two of the filters are equivalent after 90^{o} rotation, so it has only three filters distinct after rotation. Our example filterbanks have respectively only two and one filters which are distinct after rotation.
A compaction filterbank is a K-ary analysis filterbank serving the goal of signal data compression. Usually one of the K filters is a ``low-pass'' filter which will carry most of the information and the other (K-1) filters are ``high-pass.''
A pyramidal filterbank is a filterbank system where the low-pass outputs are recursively presented back to the analysis filterbank.
The Quadtree Wavelet Pyramid provides the essential step in many 2-D signal coding methods when signal samples are arranged on a square grid. As Simoncelli and Adelson showed cite{Simoncelli:90a}, Quadtree Pyramids are available for images on a hexagon grid that have better behavior than the square-grid Quadtrees. One nicety is that all three high-pass filters are identical, after rotation by 120^{o} or 240^{o}. Such a filter bank might be called ``quasi-binary'' --- after rotation there are only two basic filters. (The square grid quadtree has two high-pass filters identical after rotation by 90^{o} and a third, trouble-some, high-pass filter called the ``saddle-shape.'') In Section 2 we discuss the advantages of hexagonal sampling, the Simoncellian pyramid and some other design choices.
The popular Simoncelli-Adelson filter bank cite{Simoncelli:90a} uses 37-tap filters and is only nearly-orthogonal. In Section 3 we exhibit a similar 4-ary decomposition with comparable compaction power, better spatial localization (16-tap filters), easier computation and exact orthogonality.
Functions to attenuate (``blur'') or enhance (``sharpen'') high-frequency information are quite useful in signal processing. The two functions are logically the inverse of each other, but sharpen/blur filter banks which provide perfect reconstruction (operate as biorthogonal analysis/synthesis filter bank pairs) are rarely used, and designers are often satisfied with approximate reconstruction.
In Section 4 we exhibit 3-ary ``blurring'' and ``sharpening'' filter banks which are each other's perfect inverses. (Obviously we speak of deliberate digital blurring; there is no perfect inverse for real-world noisy blurring so our method doesn't supplant methods like Wiener filtering.)
The blurring filters have similar nearly circular shapes with one centered at each sample of the signal, and similarly for the sharpening filters. The inverse of a finite non-trivial 1-ary filter bank is always infinite, so in prior art, one is usually satisfied with almost-perfect reconstruction for such systems, but we achieve exact reconstruction with a simple trick: The three basis filters are not identical, although they are very similar and become identical after 120^{o} and 240^{o} rotation. This filter bank might be called ``quasi-unary'' --- after rotation there is but a single basic filter.
The image compaction filter bank espoused in this paper operates on the hexagon grid, is 4-ary, ``Simoncellian'' and has as its support a particular 16-pixel ``snowflake-like'' shape. We will first justify these choices.
The hexagon grid has several major advantages over the square grid:
The final point here applies particularly to Simoncellian decompositions, as we discuss in the next subsection.
Separable 4-ary (quadtree) wavelet coding on the square grid uses a low-pass term (``LL''), two similar high-pass terms (``HL,'' ``LH'' --- horizontal and vertical gradients) and a ``saddle-shape'' term (``HH''). This fourth term is unfriendly, since it combines the leftover energy from disjoint regions of Fourier space cite{Simoncelli:90c}. Compression is achieved by discarding low-energy coefficients but small errors in the ``saddle-shape'' term may cause diagonal edge energy to appear to ``flip'' orientation --- thus a small arithmetic error can cause a severe visual flaw.
On the hexagon grid there are 4-ary image decompositions comprising a low-pass term and three high-pass terms equivalent after rotation by 120^{o}. Such decompositions were introduced by Simoncelli cite{Simoncelli:90a}, so we'll call them Simoncellian. With finer orientation banding and without the ``unfriendly'' saddle-shape, quantization policies for lossy image compression are simpler and perform better than the square-grid counterparts cite{Balak:93} cite{Wegmann:96}.
The Hexagon Grid allows Tritree and Septree Pyramids cite{Burt:80} cite{Burt:81} cite{Golay:69} as well as the Quadtree. Resolution reduction by a factor of seven is too coarse to get much benefit from pyramidalization, but reduction by three might seem appealing. However the Tritree has some serious disadvantages.
The axes of the lower resolution grid in a 4-tree (Quadtree) will be parallel to the original axes, but for the 3- and 7-trees the axes will be rotated by 30^{o} and approx 19.1^{o} respectively. Such rotations might be severely impractical.
The choice of a Quadtree allows one to exploit coding software and visualization tools already developed for the Square Grid. The Quad-tree Pyramid form of a rectangular shaped image is often visualized by dividing the image into four congruent rectangles with the upper-left being a ``thumbnail'' view, and recursively so dividing the thumbnail. This can also be done with a Hexagon Grid image of overall rectangular shape, but there doesn't seem to be a similar visualization method for Tri-tree or Sep-tree Pyramids (even if the overall image were of triangular or heptangular shape).
A Tritree Pyramid would use three basic filters instead of four; the three filters might be a low-pass, and a horizontal and vertical gradient. Such a system seems superior to the ordinary Square-Grid Quadtree Pyramid since the ``saddle-shape'' filter is eliminated, and might be a good choice despite the disadvantages mentioned. Still, the basic gradients of a hexagon-based system ``should'' be offset from each other by 120^{o}, not 90^{o}, so the Simoncellian Quadtree seems more ``natural.''
Use of a quadtree does not directly imply reliance on tiles of specifically four pixels, but their use is almost dictated. The support bases for the filters in 4-ary perfect reconstruction filterbanks may need to be derived from basic 4-pixel subtiles. (The support kernels for Simoncelli's filters were not multiples of four in size, but this filter system did not provide true perfect reconstruction.)
Displayed here are three possible shapes for 4-pixel tiles. Looking from left to right, we will call them the R-Trigon, the Rhombus, and the L-Trigon. (``Trigon'' is an invented word, but it is convenient to have a specific name for this triangle shape of area four.)
We will use the R-Trigon as a tiling primitive, and will need a specific pixel numbering, as shown, for definiteness.
We prefer the Trigon to the rhombus because
Since overlap is an important attribute of good coding transforms, we will combine two or more of these 4-pel basic tiles (``subtiles'') into a larger basic kernel to serve as the support of our highest resolution filters. Seven subtiles (28 pixels) is a plausible choice for a nearly circular support, but for ease of exposition in this paper we propose a simpler kernel using just four subtiles. (It may even be impossible to construct a 28-pel orthogonal filter bank with the excellent symmetries our example will possess.)
Even when one does not overlap tiles, successive layers of a coding pyramid will implicitly use larger tiles: 16, 64, etc. for a Quadtree Pyramid.
In a hierarchy where 16-pel tiles are built from 4-pel tiles, and so on, one need not use the same tiling geometry at each resolution level. In the following figure, the 16-pel shape at the left is an ``R-Trigon of R-Trigons.'' The shape at the right is an ``L-Trigon of R-Trigons.'' (The colors have no significance except to make the subtiling more apparent.)
Having chosen the ``Trigon'' as a basic tile (``subtile''), and a kernel of four such subtiles (16 pixels) as the filter support, we are left with the choice of which four subtiles to include in the kernel. The subtiles form a hexagon grid of one-fourth the original density, so the rhombus and Trigon that we just saw in a preceding Figure are the obvious choices here as well. Most of the reasons we just gave for preferring the Trigon over the rhombus apply again, so we choose a Trigon of Trigons as our basic 16-pixel kernel. There are two additional reasons for preferring that shape:
We still have the choice of either an R-trigon of R-trigons, or an L-trigon of R-trigons. The former has a more pronounced ``jigsaw-like'' shape (it has a ``diffuse perimeter'' in which neighboring tiles will reach into each other like the knobs of a jigsaw puzzle) and would increase the postprocessing advantage just mentioned, but this is more relevant in a block than an overlapped transform, so we use the L-trigon of R-trigons.
Finally we mention the excellent symmetries that Simoncelli's filters, and ours, possess. Most imaging engineers will be intuitively delighted by our low-pass filter with its three axes of symmetry, as well as our Simoncellian high-pass filters which also have an axis of symmetry.
This presents a contrast to the Quadtree separable Wavelet coding used with the square grid: It is a well-known theorem that the only orthogonal symmetric wavelet is the Haar wavelet, which gives quite poor compression performance.
The original LaTeX paper uses this example as a case study in the design of multi-dimensional P.R. filterbanks. In this version of the paper, we abbreviate the derivation of the filterbanks.
We introduce a 4 X 4 block orthogonal transform G_{f}(X) as a building block.
G_{f}(X) = ( | 3f^{2}-1 | 2f | 2f | 2f | ) ^{. } | X / (3f^{2}+1) |
2f | f^{2}+1 | -2f^{2} | -2f^{2} | |||
2f | -2f^{2} | f^{2}+1 | -2f^{2} | |||
2f | -2f^{2} | -2f^{2} | f^{2}+1 |
As described in the LaTeX paper, the parameter f
is related to a Givens angle y by
f = (1 + cos y)/(sqrt(3) sin y)
Using G_{f}(X) as a building block, the
4 X 16 filterbank T(a,b)we introduce for the hexagonal grid can be
described simply:
T(a,b) = G_{b} . Shift . G_{a}
Shift is a 4 X 16 permutation matrix which shifts each assymetric filter output to an adjoining subtile. (The LaTeX paper offers more detail and clarity.) Elsewhere cite{Allen:98} I prove that filterbanks on the Trigon-of-Trigons shape conform to this equation if and only if they satisfy a number of useful symmetry conditions.
The necessary and sufficient condition that
the high-pass filters of T(a,b)
have zero DC response (This is called the DC separation constraint) is
b = (3a+1)/(3-3a)
or
b = (a-1)/(3a+1)
This transform will provide excellent energy compaction
throughout the range -.21 < f_A < -.14.
A particular choice which combines excellent
energy compaction and fast integer computation is
a = -1/6
For this a, either b = 1/7 or b = -7/3 are permitted by the DC separation constraint given above. We adopt the former which gives better localized high-pass filters.
The 4-X-16 Compaction Filter Bank we call T(-1/6, 1/7) comprises the low-pass filter:
. . . -14 . . . . . -84 -14 . -14 . +276 +259 . . -84 +259 +759 +276 . . -14 . +276 +259 . . . . . -84 -14 . . . . -14 . .and three high-pass filters:
. . . +50 . . . . . +300 +50 . -2 . +84 -925 . . -12 +37 +231 +84 . . -2 . +84 +37 . . . . . -12 -2 . . . . -2 . .
The values given are exact. (They must be divided by 1014 to achieve normalization.) The other two filters can be visualized by rotating the High-pass filter 120^{o} and 240^{o}.
The filter banks described in this section are orthogonal. There are similar filter banks which satisfy the same strong symmetry conditions, but which are biorthogonal --- the analysis and synthesis filter banks are different. In general one can design biorthogonal filter banks which provide better compaction than any orthogonal filter bank of the same size cite{Aase:95}, but we have avoided that here to keep our exposition simple.
We will exhibit a 3-ary P.R. filter bank system whose filters all have the same circular shape. It is useful for invertible sharpening or blurring.
The obvious blurring filter for the hexagon grid is the seven-tap kernel comprising a large central weight, and six equal smaller weights at the neighbors. If the smaller weights are negative the filter is a sharpening filter. The seven-tap blurring and sharpening filters are approximately each other's inverse; by adding 11 tiny-valued taps outside the seven-pel central kernel we will make the filters to be exact inverses.
The design of the following filter is similar to the first example,
though it uses an extra Shift stage:
T = H_{a,b} . Shift . H_{c,d} . Shift . H_{e,f}
Another difference is that the synthesis filterbank
differs from the analysis, having the same pattern but
different parameters:
A = (b - a^{2}) / (ab - 1)
B = (a - b^{2}) / (ab - 1)
Apply a 3 X 3 transform to non-overlapping 3-pixel tiles on the hexagon grid; then apply an operator lattice-shift operator. The result is three corresponding 3-tap filters (shown via subscripts 1 to 3) throughout a 9-point tile cluster,
. . a1 . . . b1 11 b3 . . . 12 13 a3 . a2 b2 . .
So far the lattice shift has only relabeled the filters, but now we apply the 3 X 3 filter bank again (replacing (a,b) with (c,d)). The three resultant new filters will be identical after rotation by 120^{o} so we need only show one,
. . a . . . b 1 bc . . . d c ac . ad bd . .
Now repeat this process one more time: lattice-shift the three 9-point filters and blend them, again using the 3 X 3 transform, now parameterized with (e,f). Again we need only show one filter.
. . ade . . . . . . bde de be+acf . . . . . . a+ce bcf+e ae+cf bdf . . . ace+b 1+bce bc+f df adf . . d af+c ac+bf . . . . ad . . . . . .
This 17-pixel quasi-unary filter depends on six free parameters but let's form a simple numeric example by choosing the blurring parameters A = B = C = D = E = F = 1/9. Now the parameters of the sharpening filter are a = b = c = d = e = f = -0.1 and the numeric values, arbitrarily scaled to make them simple integers, are:
. . -1 . . . . -1 +10 +9 . ------------ . / -90 -101 \ +20 -1 / \ / \ < -101 +999 -90 > +10 -1 \ / \ / . \ -100 -90 / +20 . ------------ . +10 . . . .
I've taken the liberty of adding lines to make the familiar 7-pixel ``center-surround'' shape clearly visible. Outside that kernel, ten tiny-valued taps are provided and chosen specifically to allow the perfect inverse filter to be finite. The perfect reconstruction property of these useful filter banks was derived effortlessly by the A_1 B_1 A_2 B_2 A_3 cascade.
Here is the blurring filter which is the perfect inverse of the sharpening filter just shown. Remember that all three rotations of these filters must be used, in a 3-ary filterbank.
. . +1 . . . . +1 +9 +10 . ------------ . / +90 +82 \ +18 +1 / \ / \ < +82 +730 +90 > +9 +1 \ / \ / . \ +81 +90 / +18 . ------------ . +9 . . . .