Zippering k-ary Ordered Trees in Neith

January 24, 2014

Using Neith to zipper k-ary trees with labeled edges is a common enough use case as to demand special attention. The core Neith zipper module is too generic; zippered tree data structures benefit from specialized constructors and operations.

Following up on my overview of Neith’s Javascript zippers, here I cover the definition of zippers and zipper operations for generic ordered k-ary trees with labeled edges.

Goals

Support variations of k-ary ordered trees with labeled edges.
Provide a basic set of generic labeled tree query, movement, and editing operations.
Allow editing nodes and edges independently.
Use specialized zipper definition functions for children and constructNode that are better suited to trees.
Support lazily generated, potentially infinite trees.

Problem With Standard Neith Zippers

The core Neith zipper has no concept of edges, but as the full binary tree zipper previous presented demonstrates, it is still fully possible to define and use zippers on trees without considering edges:

var Binary = declare(null, ['value', 'left', 'right']);

binaryZipper = let
    children = \node#{left right} ->
        stream.from([left, right]),
    
    construct = \node#{value} children ->
        new Binary(
            value,
            first(children),
            first(rest(children)))
in
    (zipper,
        children,
        construct);

Movement along edges then requires fragile and specialized operations.

// These will not work if the tree is not full since right
// could be at index 0 or 1 and this will not work
// on a ternary tree or other tree structures.
var binaryMoveToLeft = down;
var binaryMoveToRight = down \> right;

Many operations, such as renaming or reordering edges, are completely impossible using this hardcoded approach.

Generic Zipper for k-ary Trees With Labeled Edges

A ordered labeled tree zipper can be expressed as an ordered unlabeled tree zipper where each element is an edge, node pair.

var Pair = \key value -> ({'key': key, 'value': value});

var key = \{key} -> key;
var value = \{value} -> value;

Labeled tree zippers operate on these pairs, with the actual storage as pairs mostly hidden from the library consumer.

Defining a Zipper for Labeled Trees

As previously covered, Neith zippers are defined by two metadata functions:

children - Get the child elements of a node.
constructNode - Reconstruct a node from its children.

Pair will be used to zipper the tree internally, but the public API should not know about this implementation detail. Intermediate metadata functions will map back and forth between Pair and nodes as needed.

This is also an opportunity to provide a zipper definition interface more suited to trees. Tree zippers are defined by three functions:

edges - Map a node to a Nu lazy stream of edge labels.
getChild - Map a node and edge label to a child node.
constructNode - Take a node, stream of edge node pairs, and function returning a Javascript map/object of edges to child nodes and returns a reconstructed node. The map is only for connivence and is not required .

The tree zipper module expresses the core children and constructNode functions using the tree specific edge, getChild, and constructNode functions while also mapping to and from Pair.

/// Create a tree zipper
treeZipper =\edges getChild constructNode root -> let
    // `children` takes a `Pair` and returns
    // a lazy stream of `Pair` for the children
    // generated by calling `edges` and `getChild`.
    children = \element#{value} ->
        map(
            \edge -> Pair(edge, getChild(value, edge)),
            edges(value)),
    
    // `_constructNode` takes a `Pair` and
    // stream of `Pair` for the children and returns
    // a `Pair` with `element` reconstructed.
    // The tree specific `constructNode` handles
    // Actual reconstruction.
    _constructNode = \element, children ->
        Pair(
            key(element),
            constructNode(
                value(parent),
                children,
                \() -> foldl(reducer, {}, children)))
in
    // Standard neith zipper.
    // Focus is Pair with empty edge value.
    zipper.zipper(
        children,
        _constructNode,
        Pair(null, root));

Better Binary Tree Zipper

The binary tree zipper can be rewritten to take advantage of treeZipper:

var Binary = declare(null, ['value', 'left', 'right']);
Binary.edges = ['left', right];

binaryZipper = let
    children = \element ->
        stream.from<|
            Object.keys(element)
                .filter(\x -> Binary.edges.indexOf(x) !== -1),
    
    getChild = \node edge -> node[edge],
    
    construct = \x _ childrenGenerator -> let
        children = childrenGenerator()
    in
        Binary.create(x.value,
            children.left,
            children.right)
in
    (treeZipper,
        children,
        getChild,
        construct);

k-ary Tree with Configurable Edges Zipper

A k-ary tree with arbitrary string edges can also be defined:

var Kary = declare(null, [
    'value',
    'map' // JS object map of edges to children
]);

karyZipper = (treeZipper,
    \x -> Object.keys(x.map),
    \x k -> x.map[k],
    \x _ values ->
        new Kary(x.value, values()));

Tree Specific Operations

Existing Neith operations work fine for our new tree zippers, but return Pairs instead of nodes. Working directly with Pair is difficult and error prone, so a new set of operations will hide Pair by operating on edges and nodes separately.

Queries

Tree queries extract either the key or value from the core operations.

/// Get node at focus
node = zipper.extract \> value;

/// Get edge leading to focus
edge = zipper.extract \> key;

/// Get stream path of nodes leading from the focus 
nodePath = zipper.path \> (map, value);

/// Get stream path of edges leading from the focus
edgePath = zipper.path \> (map, key);

/// Get the parent node
parentNode = zipper.parent \> value;

Labeled Movement

The code movement operations also work fine with tree zippers, but are cumbersome and fragile. Labeled tree specific movements allow moving along edges, abstracting away from the underlying order of nodes.

/// Move to the nth child of the focus
nthChild = let
    goRight = \ctx count ->
        (count <= 0 ?
            ctx :
            goRight(right(ctx), count - 1))
in
    \index ctx ->
        let child = down(ctx) in
            (child ?
                goRight(child, index) :
                null);

/// Move along `edge` to a child.
child = let
    /// Get first index of `e` in stream `s`
    indexOf = \e s ->
        foldl(
            \p [i c] -> (p >= 0 ? p : (c === e ? i : p)),
            -1,
            indexed <| s)
in
    \edge ctx ->
        (!ctx ? null :
            let
                children = zipper.children(ctx),
                index = indexOf(edge, map(key, children))
            in
                (index === -1 ? null : nthChild(index, ctx)));

/// Move to the sibling with `edge` leading from parent.
sibling = \edge ctx ->
    (ctx
        |> up
        |> (child, edge));

Using labels to move along the tree:

static console;
var print = \f ctx -> { console.log(f(ctx)); return ctx; };
var karyValue = \x -> x.value;

var $ = Kary.create;
var kz = karyZipper <| $('a', {
    '2': $('b', {
        '3': $('c', {}),
        '4': $('d', {
            '5': $('e', {}),
        }),
    }),
    '6': $('f', {
        '7': $n('g', {}),
    })
});
    
kz
    |> (child, '2')
    |> (child, '4')
    |> (print, node \> karyValue) // 'd'
    |> (print, edgePath \> toArray) // ['4', '2', null]
    |> (print, nodePath \> (map, karyValue) \> toArray) // ['d', 'b', 'a']
    |> (sibling, '3')
    |> (print, node \> karyValue); // 'c'
}

Editing

Editing operations on tree zippers also use edges and nodes instead of Pair:

/// Set node, keep edge.
setNode = \node ctx ->
    zipper.replace(
        Pair(edge(ctx), node),
        ctx);

/// Modify node, keep edge.
modifyNode = \f ctx ->
    setNode(f(node(ctx)), ctx);

/// Set edge, keep node.
setEdge = \edge ctx ->
    zipper.replace(
        Pair(edge, node(ctx)),
        ctx);
        
/// Modify edge, keep node.
modifyEdge = \f ctx ->
    setEdge(f(edge(ctx)), ctx);

var kz1 = kz
    |> (child, '2')
    |> (child, '4')
    |> (modifyNode, \x -> x.charCodeAt(0)) // 100
    |> (sibling, '3')
    |> (setEdge, "new_edge");
    |> root |> extract;
    
/*
kz1 is:
$('a', {
    '2': $('b', {
        'new_edge": $('c', {}),
        '4': $('100', {
            '5': $('e', {}),
        }),
    }),
    '6': $('f', {
        '7': $n('g', {}),
    })
});
*/

All the insertion operations take both an edge and a node:

/// Insert a sibling to the left of the focus.
insertLeft = \edge node ctx ->
    zipper.insertLeft(
        Pair(edge, node),
        ctx);

/// Insert a sibling to the right of the focus.
insertRight = \edge node ctx ->
    zipper.insertRight(
        Pair(edge, node),
        ctx);

/// Insert a child at the left for the focus.
insertChild = \edge node ctx ->
    zipper.insertChild(
        Pair(edge, node),
        ctx);

/// Insert a child at the right for the focus. 
appendChild = \edge node ctx ->
    zipper.appendChild(
        Pair(edge, node),
        ctx);

Applications

ECMA AST Zipper

I use the tree zipper module to define an ECMAScript AST zipper. The AST nodes store additional metadata on valid edges and attributes that makes this possible. Some nodes in the AST are regular arrays that must be handled specially.

Code from [ecma-ast-zipper][ecma-ast-ziper]:

package (
    ecmaZipper)
with
    import 'nu-stream::stream' {foldl from NIL},
    import 'nu-stream::gen' {range},
    import 'neith::tree' {treeZipper},
    import 'ecma-ast::node' {Node modify}
in {

var buildArray = \pairs ->
    foldl(
    \p, {key value} -> {
        p[key] = value;
        return p;
    },
    [], pairs);
        
ecmaZipper = (treeZipper,
    \node ->
        ?node
            :?Array.isArray(node)
                :range(0, node.length)
                :from(node.children))
            :NIL,
    
    \n k -> n[k],
    
    \node pairs values ->
        ?node instanceof Node
            :modify(node, values(), {})
            :buildArray(pairs));

}

Other Notes and Caveats

Laziness and Infinite Trees

Node and edges are generated on demand by edges (which returns a stream), and getChild. This allows working with trees with an infinite number of immediate children and infinite depth. Tree structures like nodes do not even need to exist as objects, allowing more interesting structures to be zippered:

/// Infinite binary tree where each node stores its
/// index in a level-order traversal.
/// For example of infinite trees only, does not support editing.
var CountBinaryTree = treeZipper(
    \element -> stream.from<|['left', 'right'],
    getChild = \x edge -> (2 * x) + (edge === 'left' ? 1 : 2),
    \x _ _ -> x,
    0);
    
CountBinaryTree
    |> down
    |> down
    |> right
    |> down
    |> node; // 9

No Validity Enforcement

As designed, Neith performs no checks and will happily mutilate trees. Zipper operations themselves do not enforce anything, including edge uniqueness or the valid values for edges.

These operations on a binary tree are perfectly valid but create a meaningless structure:

var $ = Binary.create;
var bz = binaryZipper <| $(1,
    $(2,
        $(3, null, null),
        $(4,
            $(5, null, null))),
    $(6,
        null,
        $(7,
            $(8, null, null),
            null)));
            
bz
    |> (insertChild, 'left', 100)
    |> (insertChild, 'left', 100)
    |> (insertChild, null, undefined)
    |> zipper.children |> toArray;
/* output: [
    Pair(null, undefined),
    Pair('left', 100),
    Pair('left', 100),
    Pair('left', 2node),
    Pair('right', 6node)] */

Nodes can be restricted to only use specific edge values after the fact with the reconstruct function. binaryZipper reconstruct only will take values from left and right, no matter what other edges exist. This will not prevent situations like shown above however.

If you are concerned that an operation could leave the data structure in a weird state, you can also define your own operations that must satisfying certain assertions before executing:

/// Insert left if no edge already exists, otherwise replace node for edge
var safeInsertLeft = \edge node ctx ->
    let current = child(edge, ctx) in
        ?current
            :(current
                |> (setNode, node)
                |> up)
            :insertLeft(edge, node, ctx);