Transform an MTG
A note on anonymous functions
A lot of examples in this tutorial use anonymous functions. These functions are just a way to quickly define a function. For example a function that adds 1 to its input argument would usually be declared as follows:
function plus_one(x)
    x + 1
end
Here we have a name for our function: "plus_one". But sometimes we don't need to name our function because its only usage is to be passed to another function. In this case we can declare an anonymous function like so:
x -> x + 1
This is exactly the same function, but without a name.
We use x here because it is more or less a standard, but we could use any other argument name. You'll see that we use node instead when referring to an MTG node (node -> node.var), and x when we refer to a node attribute (x -> x + 1).
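As a quick check that both notations behave the same, here is a minimal sketch in plain Julia (no MTG involved). The names inputs, named_result and anon_result are only illustrative:

```julia
# Named function, as declared in the text above.
function plus_one(x)
    x + 1
end

inputs = [1, 2, 3]

# Pass the named function to `map`...
named_result = map(plus_one, inputs)

# ...or pass an anonymous function directly, without naming it first.
anon_result = map(x -> x + 1, inputs)

# Both calls produce the same vector.
println(named_result == anon_result)
```

The anonymous version is handy when the function is used only once, which is the typical situation in the calls shown in the rest of this tutorial.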
Introduction to MTG transforming
MTGs can be very large, and it quickly becomes impossible to manually change the attribute values of the nodes.
Instead, you can compute new attributes for all nodes in an MTG using transform!.
The syntax of transform! is very close to the one from DataFrames.jl. It has several forms that let you perform computations either on the node or directly on the node attributes.
Here is a summary of the different forms you can use:
- a :var_name => :new_var_name pair. This form is used to rename an attribute.
- a :var_name => function => :new_var_name or [:var_name1, :var_name2...] => function => :new_var_name form. The variables are declared as a Symbol or a String (or a vector of them), and they are passed as positional arguments to function. The new variable name is optional: if it is not provided, it is generated automatically by concatenating the source column name(s) and the function name, if any. In that case the form is written :var_name => function.
- a function => :new_var_name form that applies a function to a node and puts the result in a new attribute. This form is usually applied when searching for values in ancestors or descendants.
- a function form that applies a mutating function to a node, without expecting any output. This form is used with functions that already mutate the node and do not need to return anything, e.g. branching_order!.
This tutorial is a deep dive into these different forms.
All examples use the mutating version transform!, but there is a non-mutating version too (transform). It is used the same way but returns a modified copy of the mtg, which is a little bit slower.
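The ! suffix follows the usual Julia naming convention for functions that modify their argument. As a minimal illustration of that convention, here is a sketch using sort and sort! from Base (these are not part of this package, and the variable names are only illustrative):

```julia
v = [3, 1, 2]

# Non-mutating version: returns a sorted copy and leaves `v` untouched.
w = sort(v)
v_after_sort = copy(v)   # snapshot of `v` taken after calling `sort`

# Mutating version: sorts `v` in place.
sort!(v)

println(v_after_sort)    # still in the original order
println(v)               # now sorted
```

The same trade-off applies here: the non-mutating call needs a copy, which costs some time and memory.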
Form 1: Rename an attribute
Renaming an attribute in an MTG is very simple. It uses the exact same syntax as DataFrames.jl. First, let's check which attributes are available in the MTG:
get_attributes(mtg)
8-element Vector{Symbol}:
:description
:symbols
:scales
:XEuler
:Length
:Width
:dateDeath
:isAlive
Let's rename :Width to remove the capital letter and make it all lowercase:
transform!(mtg, :Width => :width)
Let's check if the attribute name changed:
print(get_attributes(mtg))
[:description, :symbols, :scales, :XEuler, :Length, :dateDeath, :isAlive, :width]
Yes it did!
The equivalent call with the non-mutating version of transform is:
new_mtg = transform(mtg, :Width => :width)
print(get_attributes(new_mtg))
[:description, :symbols, :scales, :XEuler, :Length, :dateDeath, :isAlive, :width]
Form 2: Compute new attributes based on other attributes
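We can also compute a new attribute based on another one. For example, we may need the length in another unit. Here we simply divide it by 10 and store the result in an attribute called :length_m (note that a true conversion from centimetres to meters would divide by 100 instead):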
transform!(mtg, :Length => (x -> x / 10) => :length_m, ignore_nothing = true)
The magic happens in the :Length => (x -> x / 10) => :length_m expression. transform! takes the :Length variable as input (LHS, left-hand side of the expression), and uses it as the argument for the anonymous function given in the middle of the expression: x -> x / 10. Then it puts the output of the function into a new variable named :length_m (RHS, right-hand side of the expression).
In fewer words, we divide the :Length attribute by 10 for every node in the MTG, and put the results in a new attribute called :length_m.
We use ignore_nothing = true to tell transform! not to process the nodes with a value of nothing for the input variable (:Length). Otherwise our computation would fail because the function we use does not handle nothing values: nothing / 10 throws an error.
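To see why skipping these nodes matters, here is a minimal sketch in plain Julia (no MTG involved). The names lengths, failed and converted are illustrative, and the comprehension only mimics the effect of ignore_nothing = true; it is not how the package implements it:

```julia
# Attribute values as they could be collected from a few nodes:
# the first two nodes have no length.
lengths = Any[nothing, nothing, 0.1, 0.2]

# Dividing `nothing` is not defined, so Julia throws a MethodError.
failed = try
    nothing / 10
    false
catch err
    err isa MethodError
end

# Hand-written guard: leave `nothing` untouched, transform the other values.
converted = [isnothing(x) ? nothing : x / 10 for x in lengths]

println(failed)
println(converted)
```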
The anonymous function must be surrounded by parentheses (like in DataFrames.jl).
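The reason is operator precedence in Julia: everything written after -> belongs to the body of the anonymous function, including a trailing => :name. The following sketch uses only Base Julia to show the difference (f and p are illustrative names):

```julia
# Without parentheses: a single function whose body is `x / 10 => :length_m`,
# so it returns a Pair instead of a number.
f = x -> x / 10 => :length_m

# With parentheses: a nested Pair, source => (function => new name).
p = :Length => (x -> x / 10) => :length_m

println(f isa Function)        # the whole expression is a function
println(f(1.0))                # calling it returns a Pair
println(p.first)               # the source attribute name
println(p.second.second)       # the new attribute name
println(p.second.first(1.0))   # the function stored in the middle
```

Only the second expression has the source => function => new name structure described above.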
Let's check if we can find :length_m in the list of our MTG attributes:
print(get_attributes(mtg))
[:description, :symbols, :scales, :XEuler, :Length, :length_m, :dateDeath, :isAlive, :width]
We can also get its values by using descendants on the root node:
descendants(mtg, :length_m)
6-element Vector{Any}:
nothing
nothing
0.01
0.02
0.01
0.02
We can also get the values in the form of a DataFrame instead:
DataFrame(mtg, :length_m)
Row | tree | id | symbol | scale | index | parent_id | link | length_m |
---|---|---|---|---|---|---|---|---|
String? | Int64? | String? | Int64? | Int64? | Int64? | String? | Float64? | |
1 | / 1: Scene | 1 | Scene | 0 | 0 | missing | / | missing |
2 | └─ / 2: Individual | 2 | Individual | 1 | 0 | 1 | / | missing |
3 | └─ / 3: Axis | 3 | Axis | 2 | 0 | 2 | / | missing |
4 | └─ / 4: Internode | 4 | Internode | 3 | 0 | 3 | / | 0.01 |
5 | ├─ + 5: Leaf | 5 | Leaf | 3 | 0 | 4 | + | 0.02 |
6 | └─ < 6: Internode | 6 | Internode | 3 | 1 | 4 | < | 0.01 |
7 | └─ + 7: Leaf | 7 | Leaf | 3 | 0 | 6 | + | 0.02 |
We can also provide several input variables if we need:
transform!(mtg, [:Length, :width] => ((x,y) -> π * x * y^2) => :volume_cm3, ignore_nothing = true)
Here we provide the input attributes as a Vector of Symbols (they could also be Strings), and give them to an anonymous function that takes two arguments as inputs. Our attributes are passed to the anonymous function in order, i.e. as positional arguments. Then we name our new attribute :volume_cm3. Again, we use ignore_nothing = true to skip the nodes with nothing values for the input attributes :Length and :width.
Let's see the results:
DataFrame(mtg, [:Length, :width, :volume_cm3])
Row | tree | id | symbol | scale | index | parent_id | link | Length | width | volume_cm3 |
---|---|---|---|---|---|---|---|---|---|---|
String? | Int64? | String? | Int64? | Int64? | Int64? | String? | Float64? | Float64? | Float64? | |
1 | / 1: Scene | 1 | Scene | 0 | 0 | missing | / | missing | missing | missing |
2 | └─ / 2: Individual | 2 | Individual | 1 | 0 | 1 | / | missing | missing | missing |
3 | └─ / 3: Axis | 3 | Axis | 2 | 0 | 2 | / | missing | missing | missing |
4 | └─ / 4: Internode | 4 | Internode | 3 | 0 | 3 | / | 0.1 | 0.02 | 0.000125664 |
5 | ├─ + 5: Leaf | 5 | Leaf | 3 | 0 | 4 | + | 0.2 | 0.1 | 0.00628319 |
6 | └─ < 6: Internode | 6 | Internode | 3 | 1 | 4 | < | 0.1 | 0.02 | 0.000125664 |
7 | └─ + 7: Leaf | 7 | Leaf | 3 | 0 | 6 | + | 0.2 | 0.1 | 0.00628319 |
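The rows above can be checked by hand. The following sketch applies the same two-argument anonymous function to the Length and width values of the two kinds of nodes in the table, in plain Julia; volume, internode_volume and leaf_volume are illustrative names:

```julia
# Same function as in the call to transform!: the first argument receives
# :Length, the second receives :width (positional order).
volume = (x, y) -> π * x * y^2

# Values read from the table: internodes (rows 4 and 6), leaves (rows 5 and 7).
internode_volume = volume(0.1, 0.02)
leaf_volume = volume(0.2, 0.1)

println(internode_volume)
println(leaf_volume)
```

Swapping the two names in the input vector would swap x and y, and therefore change the result.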
The new name of the attribute (the RHS) is optional though. We could write our first example as:
transform!(mtg, :Length => (x -> x / 10), ignore_nothing = true)
In this case the name of the new attribute is automatically computed based on the input variable name and the name of the function. If the function is anonymous, which is the case in our example, it uses the default "function" name instead. The new variable is then named :Length_function.
If we used a function with a name such as log instead of an anonymous function, the new attribute name would be :Length_log. Here's an example with the log function:
transform!(mtg, :Length => log, ignore_nothing = true)
print(get_attributes(mtg))
[:Length_function, :volume_cm3, :description, :symbols, :scales, :XEuler, :Length, :length_m, :Length_log, :dateDeath, :isAlive, :width]
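The naming rule can be sketched in a few lines of plain Julia. The helper auto_name below is hypothetical: it is not a function of the package and only reproduces the rule described above, assuming that the names of anonymous functions start with # (which is how Julia labels them internally):

```julia
# Hypothetical helper: builds an attribute name from the source name(s)
# and the function name, falling back to "function" for anonymous functions.
function auto_name(source, f)
    fname = string(nameof(f))
    if startswith(fname, "#")   # anonymous functions get generated names like "#1"
        fname = "function"
    end
    sources = source isa AbstractVector ? source : [source]
    return Symbol(join(string.(sources), "_"), "_", fname)
end

println(auto_name(:Length, log))
println(auto_name(:Length, x -> x / 10))
```

The two calls give the names seen in the attribute list above, :Length_log and :Length_function.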
Form 3: Compute a new attribute based on node values
We can compute a new attribute by providing a function directly as the left-hand side, instead of an attribute name, like so:
transform!(mtg, symbol => :Symbol)
The symbol function takes a node as its first (and only) argument, and returns its symbol. An alternative way of writing this would be:
transform!(mtg, (node -> symbol(node)) => :Symbol)
This is particularly useful when we need to compute a new attribute based on the values of the node itself.
Here we just copied the MTG symbol onto the attributes of the nodes. In this form, it is mandatory to provide a name for the newly created variable, else the function is considered to not return anything (see next form: Form 4: Apply a function to nodes).
Because this form expects a function that works on nodes directly, it is now possible to use the descendants and ancestors functions. For example we can compute the total length of the subtree of each node in an MTG (i.e. the summed length of all descendants of a node) as follows:
function get_length_descendants(x)
    nodes_lengths = descendants(x, :Length, ignore_nothing = true)
    if length(nodes_lengths) == 0
        return nothing
    else
        return sum(nodes_lengths)
    end
end
transform!(mtg, get_length_descendants => :length_subtree)
descendants(mtg, :length_subtree)
6-element Vector{Any}:
0.6000000000000001
0.6000000000000001
0.5
nothing
0.2
nothing
This form cannot use ignore_nothing = true because it does not know beforehand which attributes to look for. You'll have to use the filter_fun argument or handle nothing values inside your function instead.
Here we first declared a new function (get_length_descendants) that gets the length of all descendants of a node, and then computes the sum only if one or more values for length were found. Then we passed this function to transform! and defined our new attribute name as :length_subtree. We define the function first for clarity because it needs to handle the case where no value is found before the call to sum.
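The guard on the number of values is needed because sum fails on an empty, untyped collection. Here is a minimal sketch in plain Julia, assuming the collected values come back as a Vector{Any} as in the outputs shown above; sum_or_nothing is an illustrative name, not a function of the package:

```julia
# Summing an empty Vector{Any} throws, because Julia cannot know
# what the "zero" of an untyped, empty collection should be.
threw = try
    sum(Any[])
    false
catch
    true
end

# Same guard as in get_length_descendants, written as a one-liner.
sum_or_nothing(v) = isempty(v) ? nothing : sum(v)

println(threw)
println(sum_or_nothing(Any[]))
println(sum_or_nothing(Any[0.1, 0.2, 0.2]))
```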
An alternative way to write this would be to first get the vector of length for each node, and then to compute the sum like so:
transform!(
    mtg,
    (node -> descendants(node, :Length, ignore_nothing = true)) => :length_subtree2,
    :length_subtree2 => (x -> length(x) == 0 ? nothing : sum(x)) => :length_subtree2
)
Because transform! computes the expressions sequentially, we can re-use the result of a previous expression. This is exactly what we are doing here. First we get the lengths of all descendants of each node, and put the result in a new attribute :length_subtree2. Then we re-use the data from this attribute to compute its sum, but only if the length of the data is not 0, and put the result back into the same attribute :length_subtree2.
We can test if both calls return the same output:
all(descendants(mtg, :length_subtree2) .== descendants(mtg, :length_subtree))
true
Yes they do!
Form 4: Apply a function to nodes
We can also apply a function that performs a computation on the node like Form 3, but does not return a new attribute value. For example it can be useful to use a printing function to help us debug another function call. Here's an example where we want to print the id of the nodes that are leaf nodes:
transform!(mtg, node -> isleaf(node) ? println(node_id(node)," is a leaf") : nothing)
5 is a leaf
7 is a leaf
We can also use this form to mutate the MTG field of a node, e.g. its symbol (which is not possible with Form 2). Here's an example where we change the "Internode" symbol into "I":
transform!(mtg, node -> symbol!(node, "I"), symbol = "Internode")
mtg
/ 1: Scene
└─ / 2: Individual
└─ / 3: Axis
└─ / 4: I
├─ + 5: Leaf
└─ < 6: I
└─ + 7: Leaf
If you change the values of the MTG field of the nodes, you can update the header of the MTG stored in the root node. For example here we updated the symbols, so we should do:
mtg[:symbols] = get_classes(mtg).SYMBOL
mtg[:description] = get_description(mtg)
Note that this is not required before writing the MTG back to disk, as these values are automatically updated anyway.
Select an MTG
As in DataFrames, MultiScaleTreeGraph.jl provides a select! function for deleting all attributes not explicitly provided as arguments to the selection. The selection can also apply transformations on the fly following the same format used in transform!, with one more form though: just the name of the variable to select.
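For example we can compute the new length, and keep only this result along with the width. Note that an attribute must be selected under its current name: the width was renamed to :width in Form 1, so the :Width selection below no longer matches an attribute and only length_m appears in the output: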
mtg_select = deepcopy(mtg)
select!(mtg_select, :Length => (x -> x / 10) => :length_m, :Width, ignore_nothing = true)
DataFrame(mtg_select)
Row | tree | id | symbol | scale | index | parent_id | link | length_m |
---|---|---|---|---|---|---|---|---|
String? | Int64? | String? | Int64? | Int64? | Int64? | String? | Float64? | |
1 | / 1: Scene | 1 | Scene | 0 | 0 | missing | / | missing |
2 | └─ / 2: Individual | 2 | Individual | 1 | 0 | 1 | / | missing |
3 | └─ / 3: Axis | 3 | Axis | 2 | 0 | 2 | / | missing |
4 | └─ / 4: I | 4 | I | 3 | 0 | 3 | / | 0.01 |
5 | ├─ + 5: Leaf | 5 | Leaf | 3 | 0 | 4 | + | 0.02 |
6 | └─ < 6: I | 6 | I | 3 | 1 | 4 | < | 0.01 |
7 | └─ + 7: Leaf | 7 | Leaf | 3 | 0 | 6 | + | 0.02 |
There is also a non-mutating version of the function:
mtg_select = select(mtg, :Length => (x -> x / 10) => :length_m, :Width, ignore_nothing = true)
DataFrame(mtg_select)
Row | tree | id | symbol | scale | index | parent_id | link | length_m |
---|---|---|---|---|---|---|---|---|
String? | Int64? | String? | Int64? | Int64? | Int64? | String? | Float64? | |
1 | / 1: Scene | 1 | Scene | 0 | 0 | missing | / | missing |
2 | └─ / 2: Individual | 2 | Individual | 1 | 0 | 1 | / | missing |
3 | └─ / 3: Axis | 3 | Axis | 2 | 0 | 2 | / | missing |
4 | └─ / 4: I | 4 | I | 3 | 0 | 3 | / | 0.01 |
5 | ├─ + 5: Leaf | 5 | Leaf | 3 | 0 | 4 | + | 0.02 |
6 | └─ < 6: I | 6 | I | 3 | 1 | 4 | < | 0.01 |
7 | └─ + 7: Leaf | 7 | Leaf | 3 | 0 | 6 | + | 0.02 |
Traverse an MTG
transform! and select! use traverse! under the hood to apply a function call to each node of an MTG. traverse! is just slightly less convenient to use, as it only accepts Form 4. We can use the same call as in the last example of transform! with traverse!. Let's change the Leaf symbol into L:
traverse!(mtg, node -> symbol!(node, "L"), symbol = "Leaf")
mtg
/ 1: Scene
└─ / 2: Individual
└─ / 3: Axis
└─ / 4: I
├─ + 5: L
└─ < 6: I
└─ + 7: L
A benefit of traverse! is that it can be used with the do...end block notation for complex sets of instructions:
traverse!(mtg) do node
    if isleaf(node)
        println(node_id(node)," is a leaf")
    end
end
5 is a leaf
7 is a leaf
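The do...end notation is plain Julia syntax: the block becomes an anonymous function passed as the first argument of the call. The sketch below shows the equivalence with foreach from Base, used here as a stand-in for traverse!; the node ids and the leaf_ids vector are made up for the illustration:

```julia
ids = [4, 5, 6, 7]    # illustrative node ids
leaf_ids = [5, 7]     # ids that we pretend are leaf nodes

# Anonymous function passed explicitly as the first argument.
out_arrow = String[]
foreach(id -> id in leaf_ids && push!(out_arrow, "$id is a leaf"), ids)

# Same call written with a do block: `id` is the argument of the function.
out_do = String[]
foreach(ids) do id
    if id in leaf_ids
        push!(out_do, "$id is a leaf")
    end
end

println(out_arrow == out_do)
```

The do block reads better as soon as the body needs more than one expression.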
Mutate an MTG
For users coming from R, we also provide the @mutate_mtg! macro, which is similar to transform! but uses a more tidyverse-like syntax. All values coming from the MTG node must be preceded by node., as with .data$ in the tidyverse. The names of the attributes are shortened to just node.attr_name instead of node_attributes(node).attr_name though. Here's an example usage:
@mutate_mtg!(mtg, volume = π * 2 * node.Length, symbol = "I")
We see that we first name the new attribute and then assign it the result of the computation. Constants are provided as is, and values coming from the nodes are prefixed with node. (including the dot).
Helpers
You can use helper functions provided by MultiScaleTreeGraph.jl for:
- Filtering nodes: isroot, isleaf
- Computing the number of leaf nodes in the subtree of a node: nleaves
- Applying the pipe model to the MTG with pipe_model!, to compute the cross-section of all nodes based on an initial cross-section.
The pipe model is used in plant physiology (especially on trees) and is built around the coarse hypothesis that each leaf in a plant is (to some extent) connected to the roots via a "pipe" of constant cross-sectional area. The concepts of the pipe model are detailed in Lehnebach et al. (2018).
This package provides an implementation of the pipe model, used as follows:
first_cross_section = 0.34 # the initial cross-section of the plant
transform!(mtg, (node -> pipe_model!(node, first_cross_section)) => :cross_section_pipe)
DataFrame(mtg, :cross_section_pipe)
Row | tree | id | symbol | scale | index | parent_id | link | cross_section_pipe |
---|---|---|---|---|---|---|---|---|
String? | Int64? | String? | Int64? | Int64? | Int64? | String? | Float64? | |
1 | / 1: Scene | 1 | Scene | 0 | 0 | missing | / | 0.34 |
2 | └─ / 2: Individual | 2 | Individual | 1 | 0 | 1 | / | 0.34 |
3 | └─ / 3: Axis | 3 | Axis | 2 | 0 | 2 | / | 0.34 |
4 | └─ / 4: I | 4 | I | 3 | 0 | 3 | / | 0.34 |
5 | ├─ + 5: L | 5 | L | 3 | 0 | 4 | + | 0.113333 |
6 | └─ < 6: I | 6 | I | 3 | 1 | 4 | < | 0.226667 |
7 | └─ + 7: L | 7 | L | 3 | 0 | 6 | + | 0.226667 |
For more information about the implementation, you can check the documentation of the function: pipe_model!.
References
R. Lehnebach, R. Beyer, V. Letort, and P. Heuret, "The pipe model theory half a century on: a review", Annals of Botany, vol. 121, no. 5, pp. 773-795, Apr. 2018, doi: 10.1093/aob/mcx194.