Package 'netcom'

Title: NETwork COMparison Inference
Description: Infer system functioning with empirical NETwork COMparisons. These methods are part of a growing paradigm in network science that uses relative comparisons of networks to infer mechanistic classifications and predict systemic interventions. They have been developed and applied in Langendorf and Burgess (2021) <doi:10.1038/s41598-021-99251-7>, Langendorf (2020) <doi:10.1201/9781351190831-6>, and Langendorf and Goldberg (2019) <arXiv:1912.12551>.
Authors: Ryan Langendorf [aut, cre], Debra Goldberg [ctb], Matthew Burgess [ctb]
Maintainer: Ryan Langendorf <[email protected]>
License: GPL-3
Version: 2.1.6
Built: 2024-11-15 04:38:06 UTC
Source: https://github.com/langendorfr/netcom

Network Alignment

Description

Network alignment by comparing the entropies of diffusion kernels simulated on two networks. align takes two networks stored as matrices and returns a node-level alignment between them.

Usage

align(
  network_1_input,
  network_2_input,
  base = 2,
  max_duration,
  characterization = "entropy",
  normalization = FALSE,
  unit_test = FALSE
)

Arguments

network_1_input

The first network being aligned, which must be in matrix form. If the two networks are of different sizes, it will be easier to interpret the output if this is the smaller one.

network_2_input

The second network, which also must be a matrix.

base

Defaults to 2. The base in the series of time steps to sample the diffusion kernels at. If base = 1 every time step is sampled. If base = 2, only time steps that are powers of 2 are sampled, etc. Larger values place more emphasis on earlier time steps. This can be helpful if the diffusion kernel quickly converges to an equilibrium, and it also runs faster.

max_duration

Defaults to twice the diameter of the larger network. Sets the number of time steps to allow the diffusion kernel to spread for, which is the smallest power of base that is at least as large as max_duration.

characterization

Defaults to "entropy". Determines how the diffusion kernels are characterized. Either "entropy" or "gini". "entropy" is a size-normalized version of Shannon's entropy with base e (Euler's number). This is also known as interaction or species evenness in ecology. "gini" is the Gini coefficient.

normalization

Defaults to FALSE. Determines if self-loops should be augmented such that edge weights are proportional to those in network_1_input and network_2_input. FALSE by default because this is inappropriate for unweighted binary/logical networks where edges indicate only the presence of an interaction.

unit_test

Defaults to FALSE. Saves the following intermediate steps to help with general troubleshooting: post-processing matrix representations of both networks, time steps at which the diffusion kernels were sampled, the diffusion kernels at those time steps, the characterizations of the diffusion kernels at those time steps, and the cost matrix fed into the Hungarian algorithm where the ij element is the difference between the characterization-over-time curves for node i in the first network and node j in the second network.
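The way base and max_duration interact, as described above, can be sketched in a few lines of base R. This is only an illustration: sample_time_steps is a hypothetical helper that is not part of netcom, and starting the series at base^0 = 1 is an assumption.

```r
# Sketch only: sample_time_steps() is a hypothetical helper, not part of netcom.
sample_time_steps <- function(base, max_duration) {
  # With base = 1, every time step up to max_duration is sampled.
  if (base == 1) {
    return(seq_len(max_duration))
  }
  # Otherwise sample successive powers of base, stopping at the first power
  # that is at least as large as max_duration.
  # Starting at base^0 = 1 is an assumption of this sketch.
  steps <- 1
  while (steps[length(steps)] < max_duration) {
    steps <- c(steps, steps[length(steps)] * base)
  }
  steps
}

sample_time_steps(base = 1, max_duration = 6)   # 1 2 3 4 5 6
sample_time_steps(base = 2, max_duration = 10)  # 1 2 4 8 16
```

With base = 2 and max_duration = 10 the last sampled step is 16, the smallest power of 2 that is at least 10.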

Details

Network alignment pairs nodes between two networks so as to maximize similarities in their edge structures. This allows information from well-studied systems to be used in poorly studied ones, such as to identify unknown protein functions or ecosystems that will respond similarly to a given disturbance. Most network alignment algorithms focus on measures of topological overlap between edges of the two networks. The method implemented here compares nodes using the predictability of dynamics originating from each node in each network.

Consider network alignment as trying to compare two hypothetical cities of houses connected by roads. The approach implemented here is to pairwise compare each house with those in the other city by creating a house-specific signature. This is accomplished by quantifying the predictability of the location of a person at various times after they left their house, assuming they were moving randomly. This predictability across all houses captures much of the way each city is organized and functions.

align uses this conceptual rationale to align two networks, with nodes as houses, edges as roads, and random diffusion representing people leaving their houses and walking around the city to other houses. The mechanics of this, which are conceptually akin to flow algorithms and Laplacian dynamics, can be analytically expressed as a Markov chain raised to successive powers which are the durations of diffusion.
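The idea of a Markov chain raised to successive powers, with each node's diffusion summarized by its entropy, can be sketched in base R as follows. The helper names are hypothetical and this is not netcom's internal code. Treating rows as the source of each walk and giving isolated nodes a self-loop are assumptions of the sketch.

```r
# Sketch only: these helpers are hypothetical and are not netcom's internals.

# Turn an adjacency matrix into a row-stochastic Markov transition matrix.
# Giving isolated nodes a self-loop is an assumption so every row sums to 1.
to_transition <- function(adj) {
  isolated <- rowSums(adj) == 0
  diag(adj)[isolated] <- 1
  adj / rowSums(adj)
}

# Size-normalized Shannon entropy (base e) of one probability vector.
normalized_entropy <- function(p) {
  nonzero <- p[p > 0]
  -sum(nonzero * log(nonzero)) / log(length(p))
}

# Entropy of the diffusion started at every node, at each sampled time step.
# Returns a nodes-by-steps matrix: one signature curve per node.
entropy_curves <- function(adj, steps) {
  transition <- to_transition(adj)
  kernel <- diag(nrow(transition))     # zero steps: every walker is at home
  curves <- matrix(NA_real_, nrow = nrow(adj), ncol = length(steps))
  for (t in seq_len(max(steps))) {
    kernel <- kernel %*% transition    # transition matrix raised to the power t
    if (t %in% steps) {
      curves[, match(t, steps)] <- apply(kernel, 1, normalized_entropy)
    }
  }
  curves
}

set.seed(42)
net <- matrix(stats::runif(25, 0, 1), nrow = 5, ncol = 5)
curves <- entropy_curves(net, steps = c(1, 2, 4, 8))
dim(curves)  # 5 nodes by 4 sampled time steps
```

Each row of the result plays the role of the house-specific signature in the city analogy: how unpredictable a random walker's location is at increasing times after leaving that node.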

Note that the novel part of align lies in creating a matrix where the ij entry is a measure of similarity between node i in the first network and node j in the second. The final alignment is found using solve_LSAP in the package clue, which uses the Hungarian algorithm to solve the assignment problem optimally.
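A minimal sketch of the final two steps, building a node-by-node cost matrix from characterization-over-time curves and then solving the assignment problem, might look like this. The function curve_cost and the toy curves are hypothetical, and the summed absolute difference is an assumed stand-in for however align measures the gap between two curves. Only clue::solve_LSAP is taken from the text above.

```r
# Sketch only: curve_cost() is hypothetical; summed absolute difference is an
# assumed way to measure the gap between two curves.
curve_cost <- function(curves_1, curves_2) {
  cost <- matrix(0, nrow = nrow(curves_1), ncol = nrow(curves_2))
  for (i in seq_len(nrow(curves_1))) {
    for (j in seq_len(nrow(curves_2))) {
      cost[i, j] <- sum(abs(curves_1[i, ] - curves_2[j, ]))
    }
  }
  cost
}

# Toy curves: rows are nodes, columns are sampled time steps.
curves_1 <- rbind(c(0.2, 0.5, 0.9),
                  c(0.6, 0.8, 1.0))
curves_2 <- rbind(c(0.6, 0.8, 1.0),
                  c(0.2, 0.5, 0.9),
                  c(0.1, 0.1, 0.1))
cost <- curve_cost(curves_1, curves_2)

# Optimal pairing with the Hungarian algorithm, if clue is installed.
# solve_LSAP needs at least as many columns as rows, which is one reason to
# pass the smaller network first.
if (requireNamespace("clue", quietly = TRUE)) {
  pairing <- clue::solve_LSAP(cost)  # pairing[i] is the node matched to node i
  print(as.integer(pairing))
}
```

In this toy case node 1 of the first network has a curve identical to node 2 of the second, and vice versa, so the zero-cost pairing is the optimal one.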

Value

score

Mean of all alignment scores between nodes in both original networks network_1_input and network_2_input.

alignment

Data frame of the nodes in both networks, sorted numerically by the first network (which is why it helps to make the smaller network the first one), and the corresponding alignment score.

score_with_padding

Same as score but includes the padding nodes in the smaller network, which can be thought of as a size gap penalty for aligning differently sized networks. Only included if the input networks are different sizes.

alignment_with_padding

Same as alignment but includes the padding nodes in the smaller network. Only included if the input networks are different sizes.

References

Kuhn, H. W. (1955). The Hungarian method for the assignment problem. Naval Research Logistics (NRL), 2(1-2), 83-97.

Langendorf, R. E., & Goldberg, D. S. (2019). Aligning statistical dynamics captures biological network functioning. arXiv preprint arXiv:1912.12551.

C. Papadimitriou and K. Steiglitz (1982), Combinatorial Optimization: Algorithms and Complexity. Englewood Cliffs: Prentice Hall.

Examples

# The two networks to be aligned
net_one <- matrix(stats::runif(25,0,1), nrow=5, ncol=5)
net_two <- matrix(stats::runif(25,0,1), nrow=5, ncol=5)

align(net_one, net_two)
align(net_one, net_two, base = 1, characterization = "gini", normalization = TRUE)

Empirical parameterization

Description

Helper function to find the best-fitting version of a mechanism by searching across its parameter space.

Usage

best_fit_optim(
  parameter,
  process,
  network,
  net_size,
  net_kind,
  mechanism_kind,
  resolution,
  resolution_min,
  resolution_max,
  reps,
  power_max,
  connectance_max,
  divergence_max,
  mutation_max,
  cores,
  directed,
  method,
  cause_orientation,
  DD_kind,
  DD_weight,
  max_norm,
  best_fit_kind = "avg",
  verbose = FALSE
)

Arguments

parameter

The parameter being tested for its ability to generate networks similar to the input 'network'.

process

Name of mechanism. Currently only "ER", "PA", "DD", "DM", "SW", and "NM" are supported. Future versions will accept user-defined network-generating functions and associated parameters. ER = Erdos-Renyi random. PA = Preferential Attachment. DD = Duplication and Divergence. DM = Duplication and Mutation. SW = Small World. NM = Niche Model.

network

The network being compared to a hypothesized 'process' with a given 'parameter' value.

net_size

Number of nodes in the network.

net_kind

If the network is an adjacency matrix ("matrix") or an edge list ("list").

mechanism_kind

Either "canonical" or "grow" can be used to simulate networks. If "grow" is used, note that here it will only simulate pure mixtures made of a single mechanism.

resolution

The first step is to find the version of each process most similar to the target network. This parameter sets the number of parameter values to search across. Decrease to improve performance, but at the cost of accuracy.

resolution_min

The minimum parameter value to consider. Zero is not used because in many processes it results in degenerate systems (e.g. entirely unconnected networks). Currently process agnostic. Future versions will accept a vector of values, one for each process.

resolution_max

The maximum parameter value to consider. One is not used because in many processes it results in degenerate systems (e.g. entirely connected networks). Currently process agnostic. Future versions will accept a vector of values, one for each process.

reps

The number of networks to simulate for each parameter. More replicates increase accuracy by making the estimation of the parameter that produces networks most similar to the target network less idiosyncratic.

power_max

The maximum power of attachment in the Preferential Attachment process (PA).

connectance_max

The maximum connectance parameter for the Niche Model.

divergence_max

The maximum divergence parameter for the Duplication and Divergence/Mutation mechanisms.

mutation_max

The maximum mutation parameter for the Duplication and Mutation mechanism.

cores

The number of cores to run the classification on. When set to 1 parallelization will be ignored.

directed

Whether the target network is directed.

method

This determines the method used to compare networks at the heart of the classification. Currently "DD" (Degree Distribution) and "align" (the align function which compares networks by the entropy of diffusion on them) are supported. Future versions will allow user-defined methods.

cause_orientation

The orientation of directed adjacency matrices.

DD_kind

A vector of network properties to be used to compare networks.

DD_weight

Weights of each network property in DD_kind. A value of 1 gives equal weighting to each property.

max_norm

Binary variable indicating if each network property should be normalized so its max value (if a node-level property) is one.

best_fit_kind

How to aggregate the stochastic replicates of the process + parameter combination.

verbose

Defaults to FALSE. Whether to print all messages.
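How best_fit_kind might collapse the reps stochastic replicates into a single fit can be sketched as below. The function aggregate_fit is hypothetical, and the four option names are the ones listed for best_fit_kind in the classify documentation; whether best_fit_optim handles them in exactly this way is an assumption.

```r
# Sketch only: aggregate_fit() is a hypothetical helper, not part of netcom.
aggregate_fit <- function(distances, best_fit_kind = "avg") {
  switch(
    best_fit_kind,
    avg    = mean(distances),
    median = stats::median(distances),
    min    = min(distances),
    max    = max(distances),
    stop("best_fit_kind must be 'avg', 'median', 'min', or 'max'")
  )
}

# Distances between the target network and three simulated replicates (toy values).
replicate_distances <- c(1, 2, 6)
aggregate_fit(replicate_distances, "avg")     # 3
aggregate_fit(replicate_distances, "median")  # 2
```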

Details

Note: Currently each process is assumed to have a single governing parameter.

Value

A number measuring how different the input network is from the parameter + process combination.

References

Langendorf, R. E., & Burgess, M. G. (2020). Empirically Classifying Network Mechanisms. arXiv preprint arXiv:2012.15863.

Examples

# Import netcom
library(netcom)

# Adjacency matrix
size <- 10
network <- matrix(sample(c(0,1), size = size^2, replace = TRUE), nrow = size, ncol = size)

# Calculate how similar the input network is to Small-World networks with 
# a rewiring probability of 0.28.
best_fit_optim(
     parameter = 0.28, 
     process = "SW", 
     network = network, 
     net_size = size,
     net_kind = "matrix", 
     mechanism_kind = "grow", 
     resolution = 100, 
     resolution_min = 0.01, 
     resolution_max = 0.99, 
     reps = 3, 
     power_max = 5, 
     connectance_max = 0.5, 
     divergence_max = 0.5, 
     mutation_max = 0.5, 
     cores = 1, 
     directed = TRUE, 
     method = "DD", 
     cause_orientation = "row", 
     DD_kind = c(
         "in", "out", "entropy_in", "entropy_out", 
         "clustering_coefficient", "page_rank", "communities"
     ), 
     DD_weight = 1, 
     max_norm = FALSE,
     verbose = FALSE
)

Mechanistic Network Classification

Description

Tests a network against hypothetical generating processes using a comparative network inference.

Usage

classify(
  network,
  directed,
  method = "DD",
  net_kind = "matrix",
  mechanism_kind = "canonical",
  DD_kind = c("in", "out", "entropy_in", "entropy_out", "clustering_coefficient",
    "page_rank", "communities", "motifs_3", "motifs_4", "eq_in", "eq_out",
    "eq_entropy_in", "eq_entropy_out", "eq_clustering_coefficient", "eq_page_rank",
    "eq_communities", "eq_motifs_3", "eq_motifs_4"),
  DD_weight = c(0.0735367966, 0.0739940162, 0.0714523761, 0.0708156931, 0.0601296752,
    0.0448072016, 0.0249793608, 0.0733125084, 0.0697029389, 0.0504358835, 0.0004016029,
    0.0563752664, 0.0561878218, 0.0540490099, 0.0504347104, 0.0558106667, 0.0568270319,
    0.0567474398),
  cause_orientation = "row",
  max_norm = FALSE,
  resolution = 100,
  resolution_min = 0.01,
  resolution_max = 0.99,
  reps = 3,
  processes = c("ER", "PA", "DM", "SW", "NM"),
  test = "empirical",
  best_fit_finder = "systematic",
  power_max = 5,
  connectance_max = 0.5,
  divergence_max = 0.5,
  mutation_max = 0.5,
  null_reps = 50,
  best_fit_kind = "avg",
  best_fit_sd = 0,
  ks_dither = 0,
  ks_alternative = "two.sided",
  cores = 1,
  size_different = FALSE,
  null_dist_trim = 1,
  verbose = FALSE
)

Arguments

network

The network to be classified.

directed

Whether the target network is directed. If missing this will be inferred by the symmetry of the input network.

method

This determines the method used to compare networks at the heart of the classification. Currently "DD" (Degree Distribution) and "align" (the align function which compares networks by the entropy of diffusion on them) are supported. Future versions will allow user-defined methods. Defaults to "DD".

net_kind

If the network is an adjacency matrix ("matrix") or an edge list ("list"). Defaults to "matrix".

mechanism_kind

Either "canonical" or "grow" can be used to simulate networks. If "grow" is used, note that here it will only simulate pure mixtures made of a single mechanism. Defaults to "canonical".

DD_kind

A vector of network properties to be used to compare networks. Defaults to the vector of 18 properties shown in Usage.

DD_weight

Weights of each network property in DD_kind. Defaults to the vector of 18 weights shown in Usage, one for each property in the default DD_kind.

cause_orientation

The orientation of directed adjacency matrices. Defaults to "row".

max_norm

Binary variable indicating if each network property should be normalized so its max value (if a node-level property) is one. Defaults to FALSE.

resolution

Defaults to 100. The first step is to find the version of each process most similar to the target network. This parameter sets the number of parameter values to search across. Decrease to improve performance, but at the cost of accuracy.

resolution_min

Defaults to 0.01. The minimum parameter value to consider. Zero is not used because in many processes it results in degenerate systems (e.g. entirely unconnected networks). Currently process agnostic. Future versions will accept a vector of values, one for each process.

resolution_max

Defaults to 0.99. The maximum parameter value to consider. One is not used because in many processes it results in degenerate systems (e.g. entirely connected networks). Currently process agnostic. Future versions will accept a vector of values, one for each process.

reps

Defaults to 3. The number of networks to simulate for each parameter. More replicates increase accuracy by making the estimation of the parameter that produces networks most similar to the target network less idiosyncratic.

processes

Defaults to c("ER", "PA", "DM", "SW", "NM"). Vector of process abbreviations. Currently only the default five are supported. Future versions will accept user-defined network-generating functions and associated parameters. ER = Erdos-Renyi random. PA = Preferential Attachment. DM = Duplication and Mutation. SW = Small World. NM = Niche Model.

test

Defaults to "empirical". The test used to distinguish the null distribution of comparisons between the network being classified and the networks simulated according to a hypothesized mechanism(s), with a particular best-fitting parameter. "empirical" finds how many simulated networks were on average farther from each other than the network being classified is. "KS" uses a KS test. "WMWU" uses a Wilcoxon-Mann-Whitney-U test.

best_fit_finder

Defaults to "systematic". Determines how the best-fitting parameter of each mechanism specified in processes is found. "systematic" tries every parameter value from resolution_min to resolution_max with a step size of (resolution_max - resolution_min) / resolution. "optim_L-BFGS-B" uses the L-BFGS-B optimizer in the optimx package. "optim_GenSA" uses the GenSA optimizer in the GenSA package.

power_max

Defaults to 5. The maximum power of attachment in the Preferential Attachment process (PA).

connectance_max

Defaults to 0.5. The maximum connectance parameter for the Niche Model.

divergence_max

Defaults to 0.5. The maximum divergence parameter for the Duplication and Divergence/Mutation mechanisms.

mutation_max

Defaults to 0.5. The maximum mutation parameter for the Duplication and Mutation mechanism.

null_reps

Defaults to 50. The number of best fit networks to simulate that will be used to create a null distribution of distances between networks within the given process, which will then be used to test if the target network appears unusually distant from them and therefore likely not governed by that process.

best_fit_kind

Defaults to "avg". If null_reps is more than 1, the fit of each parameter has to be an aggregate statistic of the fit of all the null_reps networks. Must be 'avg', 'median', 'min', or 'max'.

best_fit_sd

Defaults to 0. Standard Deviation used to simulate networks with a similar but not identical best fit parameter. This is important because simulating networks with the identical parameter can artificially inflate the false negative rate by assuming the best fit parameter is the true parameter. For large resolution and reps values this will become true, but can be computationally intractable for realistically large systems.

ks_dither

Defaults to 0. The KS test cannot compute exact p-values when pairwise network distances are tied (not all unique). Adding small amounts of noise makes each distance unique. We are not aware of a study on the impacts this has on accuracy, so it is set to zero by default.

ks_alternative

Defaults to "two.sided". Governs the KS test. Assuming best_fit_sd is not too large, this can be set to "greater" because the target network cannot be more similar to identically simulated networks than they are to each other. In practice we have found "greater" and "less" produce numerical errors. Only "two.sided", "less", and "greater" are supported through stats::ks.test().

cores

Defaults to 1. The number of cores to run the classification on. When set to 1 parallelization will be ignored.

size_different

If there is a difference in the size of the networks used in the null distribution. Defaults to FALSE.

null_dist_trim

Number between zero and one that determines how much of each network comparison distribution (unknown network compared to simulated networks, simulated networks compared to each other) should be used. Prevents p-value convergence with large sample sizes. Defaults to 1, which means all comparisons are used (no trimming).

verbose

Defaults to FALSE. Whether to print all messages.
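The three options for the test argument described above can be compared on toy data. The distances below are simulated placeholders rather than netcom output, and the reading of "empirical" used here (the share of null distances at least as large as the target's average distance) is an assumption about the intent, not a transcription of the package's code. stats::ks.test and stats::wilcox.test are the standard base R tests.

```r
# Sketch only: the distances are simulated placeholders, not netcom output.
set.seed(1)
null_distances   <- stats::rexp(200, rate = 1)       # simulated vs. simulated
target_distances <- stats::rexp(50, rate = 1) + 0.5  # target vs. simulated

# "empirical" (assumed reading): share of null distances at least as large as
# the target's average distance to the simulated networks.
p_empirical <- mean(null_distances >= mean(target_distances))

# "KS": two-sample Kolmogorov-Smirnov test.
p_ks <- stats::ks.test(target_distances, null_distances,
                       alternative = "two.sided")$p.value

# "WMWU": Wilcoxon-Mann-Whitney-U (rank-sum) test.
p_wmwu <- stats::wilcox.test(target_distances, null_distances)$p.value

c(empirical = p_empirical, KS = p_ks, WMWU = p_wmwu)
```

A small p-value suggests the target network sits unusually far from networks simulated by the process, which is evidence against that process.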

Details

Note: Currently each process is assumed to have a single governing parameter.
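Because each process has a single governing parameter, the "systematic" best-fit search reduces to a one-dimensional grid. A rough sketch, assuming the grid includes both end points and using a hypothetical toy_distance in place of simulating and comparing networks:

```r
resolution     <- 100
resolution_min <- 0.01
resolution_max <- 0.99

# Parentheses matter: the step is the parameter range divided by resolution.
step_size <- (resolution_max - resolution_min) / resolution
parameter_grid <- seq(from = resolution_min, to = resolution_max, by = step_size)

# Hypothetical stand-in for "distance between the target network and networks
# simulated with this parameter". A real run would simulate and compare networks.
toy_distance <- function(parameter) (parameter - 0.28)^2

fits <- vapply(parameter_grid, toy_distance, numeric(1))
best_parameter <- parameter_grid[which.min(fits)]
best_parameter
```

Lowering resolution shrinks the grid and speeds up the search, at the cost of a coarser estimate of the best-fitting parameter.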

Value

A dataframe with 3 columns and as many rows as processes being tested (5 by default). The first column lists the processes. The second lists the p-value on the null hypothesis that the target network did come from that row's process. The third column gives the estimated parameter for that particular process.

References

Langendorf, R. E., & Burgess, M. G. (2020). Empirically Classifying Network Mechanisms. arXiv preprint arXiv:2012.15863.

Examples

# Import netcom
library(netcom)

# Adjacency matrix
size <- 10
network <- matrix(sample(c(0,1), size = size^2, replace = TRUE), nrow = size, ncol = size)

# Classify this network
# This can take several minutes to run
classify(network, processes = c("ER", "PA", "DM", "SW", "NM"))

Mechanistic Network Classification

Description

Tests a network against hypothetical generating processes using a comparative network inference.

Usage

classify_Systematic(
  network,
  directed = FALSE,
  method = "DD",
  net_kind = "matrix",
  DD_kind = c("in", "out", "entropy_in", "entropy_out", "clustering_coefficient",
    "page_rank", "communities", "motifs_3", "motifs_4", "eq_in", "eq_out",
    "eq_entropy_in", "eq_entropy_out", "eq_clustering_coefficient", "eq_page_rank",
    "eq_communities", "eq_motifs_3", "eq_motifs_4"),
  DD_weight = c(0.0735367966, 0.0739940162, 0.0714523761, 0.0708156931, 0.0601296752,
    0.0448072016, 0.0249793608, 0.0733125084, 0.0697029389, 0.0504358835, 0.0004016029,
    0.0563752664, 0.0561878218, 0.0540490099, 0.0504347104, 0.0558106667, 0.0568270319,
    0.0567474398),
  cause_orientation = "row",
  max_norm = FALSE,
  resolution = 100,
  resolution_min = 0.01,
  resolution_max = 0.99,
  reps = 3,
  processes = c("ER", "PA", "DM", "SW", "NM"),
  power_max = 5,
  connectance_max = 0.5,
  divergence_max = 0.5,
  mutation_max = 0.5,
  null_reps = 50,
  best_fit_kind = "avg",
  best_fit_sd = 0.01,
  ks_dither = 0,
  ks_alternative = "two.sided",
  cores = 1,
  size_different = FALSE,
  null_dist_trim = 1,
  verbose = TRUE
)

Arguments

network

The network to be classified.

directed

Defaults to FALSE. Whether the target network is directed.

method

This determines the method used to compare networks at the heart of the classification. Currently "DD" (Degree Distribution) and "align" (the align function which compares networks by the entropy of diffusion on them) are supported. Future versions will allow user-defined methods. Defaults to "DD".

net_kind

If the network is an adjacency matrix ("matrix") or an edge list ("list"). Defaults to "matrix".

DD_kind

A vector of network properties to be used to compare networks. Defaults to the vector of 18 properties shown in Usage.

DD_weight

Weights of each network property in DD_kind. Defaults to the vector of 18 weights shown in Usage, one for each property in the default DD_kind.

cause_orientation

The orientation of directed adjacency matrices. Defaults to "row".

max_norm

Binary variable indicating if each network property should be normalized so its max value (if a node-level property) is one. Defaults to FALSE.

resolution

Defaults to 100. The first step is to find the version of each process most similar to the target network. This parameter sets the number of parameter values to search across. Decrease to improve performance, but at the cost of accuracy.

resolution_min

Defaults to 0.01. The minimum parameter value to consider. Zero is not used because in many processes it results in degenerate systems (e.g. entirely unconnected networks). Currently process agnostic. Future versions will accept a vector of values, one for each process.

resolution_max

Defaults to 0.99. The maximum parameter value to consider. One is not used because in many processes it results in degenerate systems (e.g. entirely connected networks). Currently process agnostic. Future versions will accept a vector of values, one for each process.

reps

Defaults to 3. The number of networks to simulate for each parameter. More replicates increase accuracy by making the estimation of the parameter that produces networks most similar to the target network less idiosyncratic.

processes

Defaults to c("ER", "PA", "DM", "SW", "NM"). Vector of process abbreviations. Currently only the default five are supported. Future versions will accept user-defined network-generating functions and associated parameters. ER = Erdos-Renyi random. PA = Preferential Attachment. DM = Duplication and Mutation. SW = Small World. NM = Niche Model.

power_max

Defaults to 5. The maximum power of attachment in the Preferential Attachment process (PA).

connectance_max

Defaults to 0.5. The maximum connectance parameter for the Niche Model.

divergence_max

Defaults to 0.5. The maximum divergence parameter for the Duplication and Divergence/Mutation mechanisms.

mutation_max

Defaults to 0.5. The maximum mutation parameter for the Duplication and Mutation mechanism.

null_reps

Defaults to 50. The number of best fit networks to simulate that will be used to create a null distribution of distances between networks within the given process, which will then be used to test if the target network appears unusually distant from them and therefore likely not governed by that process.

best_fit_kind

Defaults to "avg". If null_reps is more than 1, the fit of each parameter has to be an aggregate statistic of the fit of all the null_reps networks. Must be 'avg', 'median', 'min', or 'max'.

best_fit_sd

Defaults to 0.01. Standard Deviation used to simulate networks with a similar but not identical best fit parameter. This is important because simulating networks with the identical parameter artificially inflates the false negative rate by assuming the best fit parameter is the true parameter. For large resolution and reps values this will become true, but also computationally intractable for realistically large systems.

ks_dither

Defaults to 0. The KS test cannot compute exact p-values when pairwise network distances are tied (not all unique). Adding small amounts of noise makes each distance unique. We are not aware of a study on the impacts this has on accuracy, so it is set to zero by default.

ks_alternative

Defaults to "two.sided". Governs the KS test. Assuming best_fit_sd is not too large, this can be set to "greater" because the target network cannot be more similar to identically simulated networks than they are to each other. In practice we have found "greater" and "less" produce numerical errors. Only "two.sided", "less", and "greater" are supported through stats::ks.test().

cores

Defaults to 1. The number of cores to run the classification on. When set to 1 parallelization will be ignored.

size_different

If there is a difference in the size of the networks used in the null distribution. Defaults to FALSE.

null_dist_trim

Number between zero and one that determines how much of each network comparison distribution (unknown network compared to simulated networks, simulated networks compared to each other) should be used. Prevents p-value convergence with large sample sizes. Defaults to 1, which means all comparisons are used (no trimming).

verbose

Defaults to TRUE. Whether to print all messages.
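The effect of ks_dither on tied distances can be illustrated with a short sketch. The helper dither_distances is hypothetical, and adding uniform noise of half-width ks_dither is an assumption about how ties might be broken, not a description of netcom's source.

```r
# Sketch only: dither_distances() is hypothetical; uniform noise of half-width
# ks_dither is an assumed way of breaking ties.
dither_distances <- function(distances, ks_dither) {
  if (ks_dither == 0) {
    return(distances)  # the default leaves the distances untouched
  }
  distances + stats::runif(length(distances), min = -ks_dither, max = ks_dither)
}

set.seed(7)
tied <- c(0.2, 0.2, 0.5, 0.5, 0.5, 0.9)
dithered <- dither_distances(tied, ks_dither = 1e-6)

anyDuplicated(tied) > 0       # TRUE: ties prevent exact KS p-values
anyDuplicated(dithered) == 0  # TRUE: every distance is now unique
```

The noise should stay far smaller than the typical gap between distinct distances so that it breaks ties without reordering them.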

Details

Note: Currently each process is assumed to have a single governing parameter.

Value

A dataframe with 3 columns and as many rows as processes being tested (5 by default). The first column lists the processes. The second lists the p-value on the null hypothesis that the target network did come from that row's process. The third column gives the estimated parameter for that particular process.

References

Langendorf, R. E., & Burgess, M. G. (2020). Empirically Classifying Network Mechanisms. arXiv preprint arXiv:2012.15863.

Examples

# Import netcom
library(netcom)

# Adjacency matrix
size <- 10
network <- matrix(sample(c(0,1), size = size^2, replace = TRUE), nrow = size, ncol = size)

# Classify this network
# This can take several minutes to run
classify_Systematic(network, processes = c("ER", "PA", "DM", "SW", "NM"))

Compare Networks Many-to-Many

Description

Compares every network in a list to every other network in that list.

Usage

compare(
  networks,
  net_kind = "matrix",
  method = "DD",
  cause_orientation = "row",
  DD_kind = "all",
  DD_weight = 1,
  max_norm = FALSE,
  size_different = FALSE,
  cores = 1,
  diffusion_sampling = 2,
  diffusion_limit = 10,
  verbose = FALSE
)

Arguments

networks

The list of networks being compared to each other.

net_kind

If the network is an adjacency matrix ("matrix") or an edge list ("list"). Defaults to "matrix".

method

This determines the method used to compare networks at the heart of the classification. Currently "DD" (Degree Distribution) and "align" (the align function which compares networks by the entropy of diffusion on them) are supported. Future versions will allow user-defined methods. Defaults to "DD".

cause_orientation

The orientation of directed adjacency matrices. Defaults to "row".

DD_kind

A vector of network properties to be used to compare networks. Defaults to "all", which is the average of the in- and out-degrees.

DD_weight

Weights of each network property in DD_kind. Defaults to 1, which is equal weighting for each property.

max_norm

Binary variable indicating if each network property should be normalized so its max value (if a node-level property) is one. Defaults to FALSE.

size_different

Defaults to FALSE. If TRUE, will ensure the node-level properties being compared are vectors of the same length, which is accomplished using splines.

cores

Defaults to 1. The number of cores to run the classification on. When set to 1 parallelization will be ignored.

diffusion_sampling

Base of the power to use to nonlinearly sample the diffusion kernels if method = "align". Defaults to 2.

diffusion_limit

Number of Markov steps in the diffusion kernels if method = "align". Defaults to 10.

verbose

Defaults to FALSE. Whether to print all messages.
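When size_different = TRUE, node-level properties from networks of different sizes are brought to a common length using splines. One way this could be done is sketched below with stats::spline. The helper resample_property is hypothetical, and netcom's own spline settings may differ.

```r
# Sketch only: resample_property() is hypothetical; netcom's own spline
# settings may differ.
resample_property <- function(values, n_out) {
  # Sort so the property reads as a distribution, then interpolate it onto
  # n_out evenly spaced points with a cubic spline.
  stats::spline(x = seq_along(values), y = sort(values), n = n_out)$y
}

degrees_small <- c(1, 3, 2, 5, 4)           # property of a 5-node network
degrees_large <- c(2, 2, 6, 1, 3, 7, 4, 5)  # property of an 8-node network

common_length <- max(length(degrees_small), length(degrees_large))
small_resampled <- resample_property(degrees_small, common_length)
large_resampled <- resample_property(degrees_large, common_length)

length(small_resampled) == length(large_resampled)  # TRUE
```

Once both vectors have the same length they can be compared element by element, for example with a Euclidean distance.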

Details

Note: Currently each process is assumed to have a single governing parameter.

Value

A square matrix with dimensions equal to the number of networks being compared, where the ij element is the comparison of networks i and j.
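The shape of this output can be illustrated with a toy many-to-many comparison in base R. This is not netcom's "DD" or "align" method: degree_signature and pairwise_compare are hypothetical, and the Euclidean distance between sorted average degrees is an assumption chosen only to keep the sketch short.

```r
# Sketch only: a toy comparison, not netcom's "DD" or "align" method.
# Each network is summarized by its sorted average of in- and out-degree.
degree_signature <- function(adj) {
  sort((rowSums(adj) + colSums(adj)) / 2)
}

pairwise_compare <- function(networks) {
  signatures <- lapply(networks, degree_signature)
  n <- length(signatures)
  out <- matrix(0, nrow = n, ncol = n)
  for (i in seq_len(n)) {
    for (j in seq_len(n)) {
      # Euclidean distance between the two networks' signatures.
      out[i, j] <- sqrt(sum((signatures[[i]] - signatures[[j]])^2))
    }
  }
  out
}

set.seed(3)
size <- 10
networks <- lapply(1:4, function(k) {
  matrix(sample(c(0, 1), size = size^2, replace = TRUE), nrow = size, ncol = size)
})

comparison <- pairwise_compare(networks)
dim(comparison)  # 4 4
```

In this sketch the matrix is symmetric with a zero diagonal, so it can be passed to stats::as.dist for clustering or ordination.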

References

Langendorf, R. E., & Burgess, M. G. (2020). Empirically Classifying Network Mechanisms. arXiv preprint arXiv:2012.15863.

Examples

# Import netcom
library(netcom)

# Adjacency matrix
size <- 10
comparisons <- 50
networks <- list()
for (net in 1:comparisons) {
     networks[[net]] = matrix(
         sample(
             c(0,1), 
             size = size^2, 
             replace = TRUE), 
         nrow = size,
         ncol = size)
}
compare(networks = networks)

Compare Networks One-to-Many

Description

Compares one network to a list of many networks.

Usage

compare_Target(
  target,
  networks,
  net_size,
  net_kind = "matrix",
  method = "DD",
  cause_orientation = "row",
  DD_kind = "all",
  DD_weight = 1,
  max_norm = FALSE,
  cores = 1,
  verbose = FALSE
)

Arguments

target

The network to be compared.

networks

The networks being compared to the target network.

net_size

Number of nodes in the network.

net_kind

If the network is an adjacency matrix ("matrix") or an edge list ("list"). Defaults to "matrix".

method

This determines the method used to compare networks at the heart of the classification. Currently "DD" (Degree Distribution) and "align" (the align function which compares networks by the entropy of diffusion on them) are supported. Future versions will allow user-defined methods. Defaults to "DD".

cause_orientation

The orientation of directed adjacency matrices. Defaults to "row".

DD_kind

A vector of network properties to be used to compare networks. Defaults to "all", which is the average of the in- and out-degrees.

DD_weight

Weights of each network property in DD_kind. Defaults to 1, which is equal weighting for each property.

max_norm

Binary variable indicating if each network property should be normalized so its max value (if a node-level property) is one. Defaults to FALSE.

cores

Defaults to 1. The number of cores to run the classification on. When set to 1 parallelization will be ignored.

verbose

Defaults to FALSE. Whether to print all messages.

Details

Note: Currently each process is assumed to have a single governing parameter.

Value

A pseudo-distance vector where the ith element is the comparison between the target network and the ith network being compared to.

References

Langendorf, R. E., & Burgess, M. G. (2020). Empirically Classifying Network Mechanisms. arXiv preprint arXiv:2012.15863.

Examples

# Import netcom
library(netcom)

# Adjacency matrix
size <- 10
comparisons <- 50
network_target <- matrix(sample(c(0,1), size = size^2, replace = TRUE), nrow = size, ncol = size)
network_others <- list()
for (net in 1:comparisons) {
     network_others[[net]] = matrix(
         sample(
             c(0,1),
             size = size^2,
             replace = TRUE),
         nrow = size,
         ncol = size)
}
compare_Target(target = network_target, networks = network_others, net_size = size, method = "DD")

Gini coefficient

Description

Takes a matrix and returns the Gini coefficient of each column.

Usage

gini(input, byrow = FALSE)

Arguments

input

A matrix where the Gini coefficient will be calculated on each column. Note that vector data must be converted to a single-column matrix.

byrow

Defaults to FALSE. Set to TRUE to calculate the Gini coefficient of each row.

Value

A vector of the Gini coefficients of each column.

References

Gini, C. (1912). Variabilita e mutabilita. Reprinted in Memorie di metodologica statistica (Ed. Pizetti E, Salvemini, T). Rome: Libreria Eredi Virgilio Veschi.

Examples

# Vectors are not supported. First convert to a single-column matrix.
sample_data <- runif(20, 0, 1)
gini(matrix(sample_data, ncol = 1))

# Multiple Gini coefficients can be calculated simultaneously
gini(matrix(sample_data, ncol = 2))
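
For intuition about the column-wise calculation, the following base R sketch computes a Gini coefficient per column using the standard rank-based formula. The helpers gini_vector and gini_columns are hypothetical names; the package's own normalization may differ, and the sketch assumes non-negative values with a positive sum.

```r
# Hypothetical helper (not the package's code): Gini coefficient of one vector,
# using the rank-based formula on the sorted values.
gini_vector <- function(x) {
  x <- sort(x)
  n <- length(x)
  2 * sum(seq_len(n) * x) / (n * sum(x)) - (n + 1) / n
}

# Apply the vector formula to every column of a matrix
gini_columns <- function(input) apply(input, 2, gini_vector)

# First column is perfectly equal, second is concentrated in one entry
equal_and_unequal <- cbind(rep(1, 4), c(0, 0, 0, 1))
gini_columns(equal_and_unequal)  # 0.00 0.75
```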

Grow a Duplication and Divergence Network

Description

Grows an already existing network by adding a node according to the Duplication and Divergence mechanism. Nodes can only attach to previously grown nodes.

Usage

grow_DD(
  matrix,
  x,
  divergence,
  link = 0,
  connected = FALSE,
  retcon = FALSE,
  directed = TRUE
)

Arguments

matrix

Existing network to experience growth.

x

The ID of the node to be grown.

divergence

Probability that the new node loses edges associated with the node it duplicates. Needs to be between zero and one.

link

Probability that the new node attaches to the node it duplicates. Defaults to 0.

connected

Binary argument determining if the newly grown node has to be connected to the existing network. Defaults to FALSE, to prevent rare computational slow-downs when it is unlikely to create a connected network.

retcon

Binary variable determining if already existing nodes can attach to new nodes. Defaults to FALSE.

directed

Binary variable determining if the network is directed, resulting in off-diagonal asymmetry in the adjacency matrix. Defaults to TRUE.

Details

Different from Duplication & Mutation models in that edges can only be lost.

Value

An adjacency matrix.

References

Ispolatov, I., Krapivsky, P. L., & Yuryev, A. (2005). Duplication-divergence model of protein interaction network. Physical Review E, 71(6), 061911.

Examples

# Import netcom
library(netcom)

size <- 10
existing_network <- matrix(sample(c(0,1), size = size^2, replace = TRUE), nrow = size, ncol = size)
new_network_prep <- matrix(0, nrow = size + 1, ncol = size + 1)
new_network_prep[1:size, 1:size] = existing_network
new_network <- grow_DD(matrix = new_network_prep, x = size + 1, divergence = 0.5)
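
The growth step can be pictured with a small base R sketch. This is not the package's implementation: dd_step_sketch is a hypothetical helper that assumes rows hold outgoing edges, that the duplicated node is chosen uniformly at random, and that each copied edge is lost independently with probability divergence.

```r
# Hypothetical sketch of one Duplication and Divergence step (not netcom code).
# Node x copies the outgoing edges of a random existing node, then loses each
# copied edge with probability `divergence`.
dd_step_sketch <- function(adj, x, divergence) {
  existing <- seq_len(x - 1)
  parent <- sample.int(x - 1, 1)
  keep <- stats::runif(x - 1) >= divergence
  adj[x, existing] <- adj[parent, existing] * keep
  adj
}

set.seed(42)
size <- 10
prep <- matrix(0, nrow = size + 1, ncol = size + 1)
prep[1:size, 1:size] <- sample(c(0, 1), size^2, replace = TRUE)

grown <- dd_step_sketch(prep, x = size + 1, divergence = 0.5)
```

Only the new node's row changes, which mirrors the statement that nodes can only attach to previously grown nodes.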

Grow a Duplication and Mutation Network

Description

Grows an already existing network by adding a node according to the Duplication and Mutation mechanism. Nodes can only attach to previously grown nodes.

Usage

grow_DM(
  matrix,
  x,
  divergence,
  mutation = 0,
  link = 0,
  connected = FALSE,
  retcon = FALSE,
  directed = TRUE
)

Arguments

matrix

Existing network to experience growth.

x

The ID of the node to be grown.

divergence

Probability that the new node loses edges associated with the node it duplicates. Needs to be between zero and one.

mutation

Probability that the new node gains edges not associated with the node it duplicates. Needs to be between zero and one.

link

Probability that the new node attaches to the node it duplicates. Defaults to 0.

connected

Binary argument determining if the newly grown node has to be connected to the existing network. Defaults to FALSE, to prevent rare computational slow-downs when it is unlikely to create a connected network.

retcon

Binary variable determining if already existing nodes can attach to new nodes. Defaults to FALSE.

directed

Binary variable determining if the network is directed, resulting in off-diagonal asymmetry in the adjacency matrix. Defaults to TRUE.

Details

Different from Duplication & Divergence models in that edges can also be gained, with probability set by the mutation argument.

Value

An adjacency matrix.

References

Ispolatov, I., Krapivsky, P. L., & Yuryev, A. (2005). Duplication-divergence model of protein interaction network. Physical Review E, 71(6), 061911.

Examples

# Import netcom
library(netcom)

size <- 10
existing_network <- matrix(sample(c(0,1), size = size^2, replace = TRUE), nrow = size, ncol = size)
new_network_prep <- matrix(0, nrow = size + 1, ncol = size + 1)
new_network_prep[1:size, 1:size] = existing_network
new_network <- grow_DM(matrix = new_network_prep, x = size + 1, divergence = 0.5)

Grow an Erdos-Renyi Random Network

Description

Grows an already existing network by adding a node according to the Erdos-Renyi random mechanism. Nodes can only attach to previously grown nodes.

Usage

grow_ER(matrix, x, p, retcon = FALSE, directed = TRUE)

Arguments

matrix

Existing network to experience growth.

x

The ID of the node to be grown.

p

Probability possible edges exist. Needs to be between zero and one.

retcon

Binary variable determining if already existing nodes can attach to new nodes. Defaults to FALSE.

directed

Binary variable determining if the network is directed, resulting in off-diagonal asymmetry in the adjacency matrix. Defaults to TRUE.

Details

Adds a node to a network according to the Erdos-Renyi random mechanism.

Value

An adjacency matrix.

References

Erdos, P. and Renyi, A., On random graphs, Publicationes Mathematicae 6, 290–297 (1959).

Examples

# Import netcom
library(netcom)

size <- 10
existing_network <- matrix(sample(c(0,1), size = size^2, replace = TRUE), nrow = size, ncol = size)
new_network_prep <- matrix(0, nrow = size + 1, ncol = size + 1)
new_network_prep[1:size, 1:size] = existing_network
new_network <- grow_ER(matrix = new_network_prep, x = size + 1, p = 0.5)
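
Repeated growth steps build a whole network one node at a time. The base R sketch below illustrates that loop with a hypothetical helper, er_step_sketch, which is not the package's code. It assumes rows hold outgoing edges, and it borrows the seed of two mutually connected nodes from the make_Mixture description.

```r
# Hypothetical sketch of one Erdos-Renyi growth step (not netcom code).
# Node x attaches to each previously grown node independently with probability p.
er_step_sketch <- function(adj, x, p) {
  existing <- seq_len(x - 1)
  adj[x, existing] <- stats::rbinom(x - 1, size = 1, prob = p)
  adj
}

set.seed(7)
final_size <- 8
network <- matrix(0, nrow = final_size, ncol = final_size)

# Seed: the first two nodes connect to each other
network[1, 2] <- 1
network[2, 1] <- 1

# Grow the remaining nodes one at a time
for (x in 3:final_size) {
  network <- er_step_sketch(network, x, p = 0.3)
}
```

Because new nodes only attach to earlier ones, all edges except the seed pair fall below the diagonal in this sketch.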

Grow a Niche Model Network

Description

Grows an already existing network by adding a node according to the Niche Model mechanism. Nodes can only attach to previously grown nodes.

Usage

grow_NM(matrix, x, niches, connectance = 0.2, directed = TRUE, retcon = FALSE)

Arguments

matrix

Existing network to experience growth.

x

The ID of the node to be grown.

niches

Vector of length x, with values between zero and one corresponding to each node's niche.

connectance

Niche Model parameter specifying the expected connectivity of the network, which determines for a given node the niche space window within which it attaches to every other node. Defaults to 0.2.

directed

Binary variable determining if the network is directed, resulting in off-diagonal asymmetry in the adjacency matrix. Defaults to TRUE.

retcon

Binary variable determining if already existing nodes can attach to new nodes. Defaults to FALSE.

Details

Grows a node in a network according to the Niche Model mechanism.

Value

An adjacency matrix.

References

Williams, R. J., & Martinez, N. D. (2000). Simple rules yield complex food webs. Nature, 404(6774), 180-183.

Examples

# Import netcom
library(netcom)

size <- 10
existing_network <- matrix(sample(c(0,1), size = size^2, replace = TRUE), nrow = size, ncol = size)
new_network_prep <- matrix(0, nrow = size + 1, ncol = size + 1)
new_network_prep[1:size, 1:size] = existing_network
new_network <- grow_NM(matrix = new_network_prep, x = size + 1, niches = stats::runif(size))

Grow a Preferential Attachment Network

Description

Grows an already existing network by adding a node according to the Preferential Attachment mechanism. Nodes can only attach to previously grown nodes.

Usage

grow_PA(
  matrix,
  x,
  power,
  sum_v_max = "sum",
  nascent_help = TRUE,
  retcon = FALSE,
  directed = TRUE
)

Arguments

matrix

Existing network to experience growth.

x

The ID of the node to be grown.

power

Power of attachment, which determines how much new nodes prefer to attach to nodes that have many edges compared to few. Needs to be positive.

sum_v_max

Degree distributions must be normalized, either by their "max" or "sum". Defaults to "sum".

nascent_help

Should a single edge be added to the degree distribution of all nodes so that nodes with a zero in-degree can still have a chance of being attached to by new nodes. Defaults to TRUE.

retcon

Binary variable determining if already existing nodes can attach to new nodes. Defaults to FALSE.

directed

Binary variable determining if the network is directed, resulting in off-diagonal asymmetry in the adjacency matrix. Defaults to TRUE.

Details

Adds a node in a network according to the Preferential Attachment mechanism.

Value

An adjacency matrix.

References

Barabási, A. L., & Albert, R. (1999). Emergence of scaling in random networks. Science, 286(5439), 509-512.

Examples

# Import netcom
library(netcom)

size <- 10
existing_network <- matrix(sample(c(0,1), size = size^2, replace = TRUE), nrow = size, ncol = size)
new_network_prep <- matrix(0, nrow = size + 1, ncol = size + 1)
new_network_prep[1:size, 1:size] = existing_network
new_network <- grow_PA(matrix = new_network_prep, x = size + 1, power = 2.15)
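
The role of power and nascent_help can be illustrated by computing attachment probabilities directly. The base R sketch below is an assumption-laden illustration, not the package's implementation: pa_probabilities is a hypothetical helper that takes column sums as in-degrees and normalizes by the sum.

```r
# Hypothetical sketch (not netcom code): probability that new node x attaches
# to each existing node, proportional to in-degree raised to `power`.
pa_probabilities <- function(adj, x, power, nascent_help = TRUE) {
  existing <- seq_len(x - 1)
  in_degree <- colSums(adj[existing, existing, drop = FALSE])
  # Give every node one extra edge so zero in-degree nodes can still be chosen
  if (nascent_help) in_degree <- in_degree + 1
  weights <- in_degree^power
  weights / sum(weights)
}

# Three existing nodes with in-degrees 2, 1, and 0; node 4 is about to be grown
adj <- matrix(0, nrow = 4, ncol = 4)
adj[2, 1] <- 1
adj[3, 1] <- 1
adj[3, 2] <- 1

probs <- pa_probabilities(adj, x = 4, power = 2)
probs  # 9/14, 4/14, 1/14

# Draw one attachment target for the new node
set.seed(3)
target <- sample.int(3, size = 1, prob = probs)
```

Larger values of power concentrate the probabilities on the highest-degree nodes. With nascent_help = FALSE and no edges at all, the weights in this sketch would sum to zero, which it does not handle.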

Grow a Small-World Network

Description

Grows an already existing network by adding a node according to the Small-World mechanism. Nodes can only attach to previously grown nodes.

Usage

grow_SW(matrix, x, rewire, connected = FALSE, retcon = FALSE, directed = TRUE)

Arguments

matrix

Existing network to experience growth.

x

The ID of the node to be grown.

rewire

Small-World parameter specifying the probability each edge is randomly rewired, allowing for the possibility of bridges between connected communities.

connected

Binary argument determining if the newly grown node has to be connected to the existing network. Defaults to FALSE, to prevent rare computational slow-downs when it is unlikely to create a connected network.

retcon

Binary variable determining if already existing nodes can attach to new nodes. Defaults to FALSE.

directed

Binary variable determining if the network is directed, resulting in off-diagonal asymmetry in the adjacency matrix. Defaults to TRUE.

Details

Grows a node in a network according to the Small-World mechanism.

Value

An adjacency matrix.

References

Watts, D. J., & Strogatz, S. H. (1998). Collective dynamics of 'small-world' networks. Nature, 393(6684), 440-442.

Examples

# Import netcom
library(netcom)

size <- 10
existing_network <- matrix(sample(c(0,1), size = size^2, replace = TRUE), nrow = size, ncol = size)
new_network_prep <- matrix(0, nrow = size + 1, ncol = size + 1)
new_network_prep[1:size, 1:size] = existing_network
new_network <- grow_SW(matrix = new_network_prep, x = size + 1, rewire = 0.213)

Induced Conserved Structure (ICS)

Description

Calculates the Induced Conserved Structure proposed by Patro and Kingsford (2012) of an alignment between two networks.

Usage

ics(network_1_input, network_2_input, alignment, flip = FALSE)

Arguments

network_1_input

The first network being aligned, which must be in matrix form. If the two networks are of different sizes, it will be easier to interpret the output if this is the smaller one.

network_2_input

The second network, which also must be a matrix.

alignment

A matrix, such as is output by the function align, where the first two columns contain corresponding node IDs for the two networks that were aligned.

flip

Defaults to FALSE. Set to TRUE if the first network is larger than the second. This is necessary because ICS is not a symmetric measure of alignment quality.

Value

A number ranging between 0 and 1. If the Induced Conserved Structure is 1, the two networks are isomorphic (identical) under the given alignment.

References

Patro, R., & Kingsford, C. (2012). Global network alignment using multiscale spectral signatures. Bioinformatics, 28(23), 3105-3114.

Examples

# Note that ICS is only defined on unweighted networks.
net_one <- round(matrix(runif(25,0,1), nrow=5, ncol=5))
net_two <- round(matrix(runif(25,0,1), nrow=5, ncol=5))

ics(net_one, net_two, align(net_one, net_two)$alignment)
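
To make the measure concrete, the base R sketch below computes ICS as the number of conserved edges divided by the number of edges in the part of the second network induced by the aligned nodes, following the definition in Patro and Kingsford (2012). The helper ics_sketch is hypothetical: it takes a plain mapping vector rather than the alignment matrix the package expects, and it does not reproduce the flip argument.

```r
# Hypothetical sketch (not netcom code). mapping[i] is the node of net_2 that
# node i of net_1 is aligned to.
ics_sketch <- function(net_1, net_2, mapping) {
  # Subnetwork of net_2 induced by the aligned nodes, in net_1's node order
  induced <- net_2[mapping, mapping, drop = FALSE]
  conserved <- sum(net_1 == 1 & induced == 1)
  conserved / sum(induced == 1)
}

# A directed 3-cycle
net_1 <- matrix(c(0, 1, 0,
                  0, 0, 1,
                  1, 0, 0), nrow = 3, byrow = TRUE)

# net_2 is net_1 with its nodes relabelled, so the alignment is perfect
mapping <- c(3, 1, 2)
net_2 <- matrix(0, nrow = 3, ncol = 3)
net_2[mapping, mapping] <- net_1
ics_sketch(net_1, net_2, mapping)  # 1

# An extra edge among the aligned nodes of net_2 lowers the score
net_2_extra <- net_2
net_2_extra[mapping[1], mapping[3]] <- 1
ics_sketch(net_1, net_2_extra, mapping)  # 0.75
```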

Makes a Duplication and Divergence Network

Description

Makes a network according to the Duplication and Divergence mechanism.

Usage

make_DD(size, net_kind, divergence, directed = TRUE)

Arguments

size

Number of nodes in the network.

net_kind

If the network is an adjacency matrix ("matrix") or an edge list ("list").

divergence

Probability that the new node loses edges associated with the node it duplicates. Needs to be between zero and one.

directed

Whether the target network is directed. Defaults to TRUE.

Details

Different from Duplication & Mutation models in that edges can only be lost.

Value

An adjacency matrix.

References

Ispolatov, I., Krapivsky, P. L., & Yuryev, A. (2005). Duplication-divergence model of protein interaction network. Physical Review E, 71(6), 061911.

Examples

# Import netcom
library(netcom)

# Network size (number of nodes)
size <- 10

# Divergence parameter
divergence <- 0.237

# Make network according to the Duplication & Divergence mechanism
make_DD(size = size, net_kind = "matrix", divergence = divergence)

Make a Duplication and Mutation Network

Description

Makes a network according to the Duplication and Mutation mechanism.

Usage

make_DM(size, net_kind, divergence, mutation, directed = FALSE)

Arguments

size

Number of nodes in the network.

net_kind

If the network is an adjacency matrix ("matrix") or an edge list ("list").

divergence

Probability that the new node loses edges associated with the node it duplicates. Needs to be between zero and one.

mutation

Probability that the new node gains edges not associated with the node it duplicates. Needs to be between zero and one.

directed

Binary variable determining if the network is directed, resulting in off-diagonal asymmetry in the adjacency matrix. Defaults to FALSE.

Details

Different from Duplication & Divergence models in that edges can also be gained, with probability set by the mutation argument.

Value

An adjacency matrix.

References

Ispolatov, I., Krapivsky, P. L., & Yuryev, A. (2005). Duplication-divergence model of protein interaction network. Physical Review E, 71(6), 061911.

Examples

# Import netcom
library(netcom)

# Network size (number of nodes)
size <- 10

# Divergence parameter
divergence <- 0.237

# Mutation parameter
mutation <- 0.1

# Make network according to the Duplication & Mutation mechanism
make_DM(size = size, net_kind = "matrix", divergence = divergence, mutation = mutation)

Make a Mixture Mechanism Network

Description

Creates a network by iteratively adding or rewiring nodes, each capable of attaching to existing nodes according to a user-specified mechanism.

Usage

make_Mixture(
  mechanism,
  directed,
  parameter,
  kind,
  size,
  niches,
  retcon = FALSE,
  link_DD = 0,
  link_DM = 0,
  force_connected = FALSE
)

Arguments

mechanism

A vector of mechanism names corresponding to the mechanisms each node acts in accordance with. Note that the first two mechanisms are irrelevant because the first two nodes default to connecting to each other. Currently supported mechanisms: "ER" (Erdos-Renyi random), "PA" (Preferential Attachment), "DD" (Duplication and Divergence), "DM" (Duplication and Mutation), "SW" (Small-World), and "NM" (Niche Model).

directed

A binary variable determining if the network is directed, resulting in off-diagonal asymmetry in the adjacency matrix. Either a single value or a vector of values the same length as the mechanism input vector.

parameter

Parameter of each node's mechanism. Either a single value or a vector of values the same length as the mechanism input vector.

kind

Either 'grow' or 'rewire', and determines if the nodes specified in the mechanism input vector are to be rewired or grown. Either a single value or a vector of values the same length as the mechanism input vector. The number of 'grow' nodes, excluding the first two which are always a pair of bidirectionally connected nodes, is the size of the final network.

size

Typically not specified. The size of the network depends on how many 'grow' events are part of the 'kind' input sequence. This should only be used when all four components of the network evolution ('mechanism', 'kind', 'parameter', and 'directed') are single name inputs instead of vectors.

niches

Used by the Niche Model to determine which nodes interact. Needs to be a vector of the same length as the number of nodes, and range between zero and one.

retcon

Binary variable determining if already existing nodes can attach to new nodes. Defaults to FALSE.

link_DD

Defaults to 0. A second parameter in the DD (Duplication & Divergence) mechanism. Currently only one parameter per mechanism can be specified.

link_DM

Defaults to 0. A second parameter in the DM (Duplication & Mutation) mechanism. Currently only one parameter per mechanism can be specified.

force_connected

Defaults to FALSE. Determines if nodes can be added to the growing network that are disconnected. If TRUE, this is prevented by re-determining the offending node's edges until the network is connected.

Details

This function grows, one node at a time, a mixture mechanism network. As each node is added to the growing network it can attach to existing nodes by its own node-specific mechanism. A sequence of mechanism names must be provided. Note: Currently each mechanism is assumed to have a single governing parameter.

Value

An unweighted mixture mechanism adjacency matrix.

References

Langendorf, R. E., & Burgess, M. G. (2020). Empirically Classifying Network Mechanisms. arXiv preprint arXiv:2012.15863.

Examples

# Import netcom
library(netcom)

# Start by creating a sequence of network evolutions. 
# There are four components to this sequence that can each be defined for every step 
# in the network's evolution. Or, you can also specify a component once which will 
# be used for every step in the network's evolution.

mechanism <- c(
    rep("ER", 7),
    rep("PA", 2),
    rep("ER", 3)
)

kind <- c(
    rep("grow", 7),
    rep("rewire", 2),
    rep("grow", 3)
)

parameter <- c(
    rep(0.3, 7),
    rep(2, 2),
    rep(0.3, 3)
)
directed <- c(
    rep(TRUE, 7),
    rep(FALSE, 2),
    rep(TRUE, 3)
)

# Simulate a network according to the rules of this system evolution.
network <- make_Mixture(
     mechanism = mechanism, 
     kind = kind, 
     parameter = parameter, 
     directed = directed
)
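
Before simulating, it can help to lay the four components side by side and check that they line up step by step. The base R sketch below rebuilds the vectors from the example and collects them in a data frame. The name evolution is only an illustrative choice and is not something make_Mixture requires.

```r
# Same four components as in the example above
mechanism <- c(rep("ER", 7), rep("PA", 2), rep("ER", 3))
kind <- c(rep("grow", 7), rep("rewire", 2), rep("grow", 3))
parameter <- c(rep(0.3, 7), rep(2, 2), rep(0.3, 3))
directed <- c(rep(TRUE, 7), rep(FALSE, 2), rep(TRUE, 3))

# One row per step in the network's evolution (illustrative helper table)
evolution <- data.frame(
  step = seq_along(mechanism),
  mechanism = mechanism,
  kind = kind,
  parameter = parameter,
  directed = directed,
  stringsAsFactors = FALSE
)

# All components must describe the same number of steps
stopifnot(
  length(kind) == length(mechanism),
  length(parameter) == length(mechanism),
  length(directed) == length(mechanism)
)

# Count how many steps grow a node and how many rewire one
table(evolution$kind)
```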

Make a Niche Model network

Description

Creates a single network according to the Niche Model. Can be directed or undirected, but is always unweighted.

Usage

make_NM(
  size,
  niches,
  net_kind = "matrix",
  connectance = 0.1,
  directed = TRUE,
  grow = FALSE
)

Arguments

size

The number of nodes in the network. Must be a positive integer.

niches

A vector of numbers specifying the niche of each member of the system (node). Each niche value must be an element of [0,1].

net_kind

The format of the network. Currently must be either 'matrix' or 'list'.

connectance

Defaults to 0.1. The ratio of actual interactions to possible interactions. Affects the beta distributed width of niche values each member of the system (node) interacts with.

directed

If FALSE all interactions will be made symmetric. Note that the process of creating interactions is unaffected by this choice. Defaults to TRUE.

grow

Binary argument that determines if the network should be made in a growing fashion, where nodes' edges are added in order of their niches and can only attach to previously considered nodes. Defaults to FALSE.

Value

An interaction matrix format of a Niche Model network.

References

Williams, R. J., & Martinez, N. D. (2000). Simple rules yield complex food webs. Nature, 404(6774), 180-183.

Examples

# Import netcom
library(netcom)

# Network size (number of nodes)
size <- 10

# Create niche values for each member of the system (node)
niches <- stats::runif(n = size)

# Make network according to the Niche Model
make_NM(size = size, niches = niches)
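
How connectance shapes the beta distributed widths can be seen in a compact base R sketch of the Niche Model as described by Williams and Martinez (2000). This is an illustration under stated assumptions, not the package's implementation: niche_sketch is a hypothetical helper, rows are taken to be the interacting (consumer) nodes, and the result is always directed.

```r
# Hypothetical sketch of the Niche Model (not netcom code).
niche_sketch <- function(niches, connectance) {
  size <- length(niches)
  # Beta(1, b) widths with mean 2 * connectance
  beta_shape <- 1 / (2 * connectance) - 1
  ranges <- niches * stats::rbeta(size, shape1 = 1, shape2 = beta_shape)
  # Each range is centred at or below the node's own niche value
  centers <- stats::runif(size, min = ranges / 2, max = niches)
  adj <- matrix(0, nrow = size, ncol = size)
  for (i in seq_len(size)) {
    # Node i interacts with every node whose niche falls inside its range
    adj[i, ] <- as.numeric(abs(niches - centers[i]) <= ranges[i] / 2)
  }
  adj
}

set.seed(11)
niches <- stats::runif(n = 10)
web <- niche_sketch(niches, connectance = 0.1)
```

Raising connectance widens the ranges on average, so more niche values fall inside each node's window.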

Mechanism Null Distributions

Description

Creates a null distribution for a mechanism and parameter combination.

Usage

make_Null(
  input_network,
  net_kind,
  mechanism_kind,
  process,
  parameter,
  net_size,
  iters,
  method,
  neighborhood,
  DD_kind,
  DD_weight,
  directed,
  resolution_min = 0.01,
  resolution_max = 0.99,
  power_max = 5,
  connectance_max = 0.5,
  divergence_max = 0.5,
  best_fit_sd = 0,
  cores = 1,
  size_different = FALSE,
  cause_orientation = "row",
  max_norm = FALSE,
  verbose = FALSE
)

Arguments

input_network

The network for which to create a null distribution.

net_kind

If the network is an adjacency matrix ("matrix") or an edge list ("list"). Defaults to "matrix".

mechanism_kind

Either "canonical" or "grow" can be used to simulate networks. If "grow" is used, note that here it will only simulate pure mixtures made of a single mechanism.

process

Name of mechanism. Currently only "ER", "PA", "DD", "DM", "SW", and "NM" are supported. Future versions will accept user-defined network-generating functions and associated parameters. ER = Erdos-Renyi random. PA = Preferential Attachment. DD = Duplication and Divergence. DM = Duplication and Mutation. SW = Small World. NM = Niche Model.

parameter

Parameter in the governing mechanism.

net_size

Number of nodes in the network.

iters

Number of replicates in the null distribution. Note that length(null_dist) = ((iters^2)-iters)/2.

method

This determines the method used to compare networks at the heart of the classification. Currently "DD" (Degree Distribution) and "align" (the align function which compares networks by the entropy of diffusion on them) are supported. Future versions will allow user-defined methods.

neighborhood

The range of nodes that form connected communities. Note: This implementation results in overlap of communities.

DD_kind

A vector of network properties to be used to compare networks.

DD_weight

A vector of weights for the relative importance of the network properties in DD_kind being used to compare networks. Should be the same length as DD_kind.

directed

Whether the target network is directed.

resolution_min

The minimum parameter value to consider. Zero is not used because in many processes it results in degenerate systems (e.g. entirely unconnected networks). Currently process agnostic. Future versions will accept a vector of values, one for each process. Defaults to 0.01.

resolution_max

The maximum parameter value to consider. One is not used because in many processes it results in degenerate systems (e.g. entirely connected networks). Currently process agnostic. Future versions will accept a vector of values, one for each process. Defaults to 0.99.

power_max

Defaults to 5. The maximum power of attachment in the Preferential Attachment process (PA).

connectance_max

Defaults to 0.5. The maximum connectance parameter for the Niche Model.

divergence_max

Defaults to 0.5. The maximum divergence parameter for the Duplication and Divergence/Mutation mechanisms.

best_fit_sd

Defaults to 0. Standard Deviation used to simulate networks with a similar but not identical best fit parameter. This is important because simulating networks with the identical parameter artificially inflates the false negative rate by assuming the best fit parameter is the true parameter. For large resolution and reps values this will become true, but also computationally intractable for realistically large systems.

cores

Defaults to 1. The number of cores to run the classification on. When set to 1 parallelization will be ignored.

size_different

If there is a difference in the size of the networks used in the null distribution. Defaults to FALSE.

cause_orientation

The orientation of directed adjacency matrices. Defaults to "row".

max_norm

Binary variable indicating if each network property should be normalized so its max value (if a node-level property) is one. Defaults to FALSE.

verbose

Defaults to FALSE. Whether to print all messages.

Details

Produces ground-truthing network data.

Value

A list. The first element contains the networks. The second contains their corresponding parameters.

References

Langendorf, R. E., & Burgess, M. G. (2020). Empirically Classifying Network Mechanisms. arXiv preprint arXiv:2012.15863.

Examples

# Import netcom
library(netcom)

make_Systematic(net_size = 10)
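
The note on iters says the null distribution has ((iters^2)-iters)/2 values, which is the number of distinct pairs among iters replicate networks. A short base R check of that count, using an illustrative value of iters:

```r
# Illustrative number of replicate networks
iters <- 10

# Every unordered pair of replicates, one pair per column
replicate_pairs <- utils::combn(iters, 2)

ncol(replicate_pairs)    # 45
(iters^2 - iters) / 2    # 45
```

The count grows quadratically, so doubling iters roughly quadruples the number of comparisons.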

Mechanism Null Distributions

Description

Creates a null distribution for a mechanism and parameter combination.

Usage

make_Null_canonical(
  input_network,
  net_kind,
  process,
  parameter,
  net_size,
  iters,
  method,
  neighborhood,
  DD_kind,
  DD_weight,
  directed,
  resolution_min = 0.01,
  resolution_max = 0.99,
  power_max = 5,
  connectance_max = 0.5,
  divergence_max = 0.5,
  best_fit_sd = 0,
  cores = 1,
  size_different = FALSE,
  cause_orientation = "row",
  max_norm = FALSE,
  verbose = FALSE
)

Arguments

input_network

The network for which to create a null distribution.

net_kind

If the network is an adjacency matrix ("matrix") or an edge list ("list"). Defaults to "matrix".

process

Name of mechanism. Currently only "ER", "PA", "DD", "DM", "SW", and "NM" are supported. Future versions will accept user-defined network-generating functions and associated parameters. ER = Erdos-Renyi random. PA = Preferential Attachment. DD = Duplication and Divergence. DM = Duplication and Mutation. SW = Small World. NM = Niche Model.

parameter

Parameter in the governing mechanism.

net_size

Number of nodes in the network.

iters

Number of replicates in the null distribution. Note that length(null_dist) = ((iters^2)-iters)/2.

method

This determines the method used to compare networks at the heart of the classification. Currently "DD" (Degree Distribution) and "align" (the align function which compares networks by the entropy of diffusion on them) are supported. Future versions will allow user-defined methods.

neighborhood

The range of nodes that form connected communities. Note: This implementation results in overlap of communities.

DD_kind

A vector of network properties to be used to compare networks.

DD_weight

A vector of weights for the relative importance of the network properties in DD_kind being used to compare networks. Should be the same length as DD_kind.

directed

Whether the target network is directed.

resolution_min

The minimum parameter value to consider. Zero is not used because in many processes it results in degenerate systems (e.g. entirely unconnected networks). Currently process agnostic. Future versions will accept a vector of values, one for each process. Defaults to 0.01.

resolution_max

The maximum parameter value to consider. One is not used because in many processes it results in degenerate systems (e.g. entirely connected networks). Currently process agnostic. Future versions will accept a vector of values, one for each process. Defaults to 0.99.

power_max

Defaults to 5. The maximum power of attachment in the Preferential Attachment process (PA).

connectance_max

Defaults to 0.5. The maximum connectance parameter for the Niche Model.

divergence_max

Defaults to 0.5. The maximum divergence parameter for the Duplication and Divergence/Mutation mechanisms.

best_fit_sd

Defaults to 0. Standard Deviation used to simulate networks with a similar but not identical best fit parameter. This is important because simulating networks with the identical parameter artificially inflates the false negative rate by assuming the best fit parameter is the true parameter. For large resolution and reps values this will become true, but also computationally intractable for realistically large systems.

cores

Defaults to 1. The number of cores to run the classification on. When set to 1 parallelization will be ignored.

size_different

If there is a difference in the size of the networks used in the null distribution. Defaults to FALSE.

cause_orientation

The orientation of directed adjacency matrices. Defaults to "row".

max_norm

Binary variable indicating if each network property should be normalized so its max value (if a node-level property) is one. Defaults to FALSE.

verbose

Defaults to FALSE. Whether to print all messages.

Details

Produces ground-truthing network data.

Value

A list. The first element contains the networks. The second contains their corresponding parameters.

References

Langendorf, R. E., & Burgess, M. G. (2020). Empirically Classifying Network Mechanisms. arXiv preprint arXiv:2012.15863.

Examples

# Import netcom
library(netcom)

make_Systematic(net_size = 10)
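
The idea behind best_fit_sd can be sketched in base R as drawing each replicate's parameter from a normal distribution centred on the best fit value. The helper jitter_parameter is hypothetical, and clamping the draws to the resolution_min and resolution_max bounds is an assumption of this sketch rather than documented package behaviour.

```r
# Hypothetical sketch (not netcom code): one parameter value per replicate,
# scattered around the best fit and kept inside the allowed range.
jitter_parameter <- function(best_fit, best_fit_sd, iters,
                             resolution_min = 0.01, resolution_max = 0.99) {
  draws <- stats::rnorm(iters, mean = best_fit, sd = best_fit_sd)
  pmin(pmax(draws, resolution_min), resolution_max)
}

set.seed(5)
# Similar but not identical parameters
jittered <- jitter_parameter(best_fit = 0.3, best_fit_sd = 0.01, iters = 5)

# A standard deviation of zero reproduces the best fit parameter exactly
identical_fit <- jitter_parameter(best_fit = 0.3, best_fit_sd = 0, iters = 5)
```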

Mechanism Null Distributions

Description

Creates a null distribution for a mechanism and parameter combination.

Usage

make_Null_mixture(
  input_network,
  net_kind,
  process,
  parameter,
  net_size,
  iters,
  method,
  neighborhood,
  DD_kind,
  DD_weight,
  directed,
  resolution_min = 0.01,
  resolution_max = 0.99,
  power_max = 5,
  connectance_max = 0.5,
  divergence_max = 0.5,
  best_fit_sd = 0,
  cores = 1,
  size_different = FALSE,
  cause_orientation = "row",
  max_norm = FALSE,
  verbose = FALSE
)

Arguments

input_network

The network for which to create a null distribution.

net_kind

If the network is an adjacency matrix ("matrix") or an edge list ("list"). Defaults to "matrix".

process

Name of mechanism. Currently only "ER", "PA", "DD", "DM", "SW", and "NM" are supported. Future versions will accept user-defined network-generating functions and associated parameters. ER = Erdos-Renyi random. PA = Preferential Attachment. DD = Duplication and Divergence. DM = Duplication and Mutation. SW = Small World. NM = Niche Model.

parameter

Parameter in the governing mechanism.

net_size

Number of nodes in the network.

iters

Number of replicates in the null distribution. Note that length(null_dist) = ((iters^2)-iters)/2.

method

This determines the method used to compare networks at the heart of the classification. Currently "DD" (Degree Distribution) and "align" (the align function which compares networks by the entropy of diffusion on them) are supported. Future versions will allow user-defined methods.

neighborhood

The range of nodes that form connected communities. Note: This implementation results in overlap of communities.

DD_kind

A vector of network properties to be used to compare networks.

DD_weight

A vector of weights for the relative importance of the network properties in DD_kind being used to compare networks. Should be the same length as DD_kind.

directed

Whether the target network is directed.

resolution_min

The minimum parameter value to consider. Zero is not used because in many processes it results in degenerate systems (e.g. entirely unconnected networks). Currently process agnostic. Future versions will accept a vector of values, one for each process. Defaults to 0.01.

resolution_max

The maximum parameter value to consider. One is not used because in many processes it results in degenerate systems (e.g. entirely connected networks). Currently process agnostic. Future versions will accept a vector of values, one for each process. Defaults to 0.99.

power_max

Defaults to 5. The maximum power of attachment in the Preferential Attachment process (PA).

connectance_max

Defaults to 0.5. The maximum connectance parameter for the Niche Model.

divergence_max

Defaults to 0.5. The maximum divergence parameter for the Duplication and Divergence/Mutation mechanisms.

best_fit_sd

Defaults to 0. Standard Deviation used to simulate networks with a similar but not identical best fit parameter. This is important because simulating networks with the identical parameter artificially inflates the false negative rate by assuming the best fit parameter is the true parameter. For large resolution and reps values this will become true, but also computationally intractable for realistically large systems.

cores

Defaults to 1. The number of cores to run the classification on. When set to 1 parallelization will be ignored.

size_different

If there is a difference in the size of the networks used in the null distribution. Defaults to FALSE.

cause_orientation

The orientation of directed adjacency matrices. Defaults to "row".

max_norm

Binary variable indicating if each network property should be normalized so its max value (if a node-level property) is one. Defaults to FALSE.

verbose

Defaults to FALSE. Whether to print all messages.

Details

Produces ground-truthing network data.

Value

A list. The first element contains the networks. The second contains their corresponding parameters.

References

Langendorf, R. E., & Burgess, M. G. (2020). Empirically Classifying Network Mechanisms. arXiv preprint arXiv:2012.15863.

Examples

# Import netcom
library(netcom)

make_Systematic(net_size = 10)

Makes a Small-World Network

Description

Makes a network according to the Small-World mechanism.

Usage

make_SW(size, rewire, neighborhood, net_kind = "matrix", directed = FALSE)

Arguments

size

The number of nodes in the network. Must be a positive integer.

rewire

Small-World parameter specifying the probability each edge is randomly rewired, allowing for the possibility of bridges between connected communities.

neighborhood

The range of nodes that form connected communities. Note: This implementation results in overlap of communities.

net_kind

The format of the network. Currently must be either 'matrix' or 'list'.

directed

Binary variable determining if the network is directed, resulting in off-diagonal asymmetry in the adjacency matrix. Defaults to FALSE.

Details

Rewires a node in a network according to the Small-World mechanism.

Value

An adjacency matrix.

References

Watts, D. J., & Strogatz, S. H. (1998). Collective dynamics of 'small-world' networks. Nature, 393(6684), 440-442.

Examples

# Import netcom
library(netcom)

# Network size (number of nodes)
size <- 10

# Rewiring parameter
rewire <- 0.2

# Make network according to the Small-World mechanism
make_SW(size = size, net_kind = "matrix", rewire = rewire)
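
For intuition about how neighborhood and rewire interact, the following base-R sketch builds a ring lattice and then rewires edges in the style of Watts and Strogatz (1998). It is an illustration under stated assumptions, not the code make_SW runs; the helper name sketch_small_world is hypothetical, and it assumes an undirected network with 2 * neighborhood < size.

```r
# Illustrative sketch only: NOT netcom's make_SW implementation.
sketch_small_world <- function(size, rewire, neighborhood) {
  net <- matrix(0, nrow = size, ncol = size)

  # Ring lattice: each node links to its `neighborhood` nearest nodes on each side
  for (i in seq_len(size)) {
    for (k in seq_len(neighborhood)) {
      j <- ((i - 1 + k) %% size) + 1
      net[i, j] <- 1
      net[j, i] <- 1
    }
  }

  # Rewire each lattice edge (i, j) with probability `rewire`
  for (i in seq_len(size)) {
    for (k in seq_len(neighborhood)) {
      j <- ((i - 1 + k) %% size) + 1
      if (net[i, j] == 1 && runif(1) < rewire) {
        # Candidate targets: not i itself and not already linked to i
        candidates <- setdiff(which(net[i, ] == 0), i)
        if (length(candidates) > 0) {
          new_j <- candidates[sample.int(length(candidates), 1)]
          net[i, j] <- 0
          net[j, i] <- 0
          net[i, new_j] <- 1
          net[new_j, i] <- 1
        }
      }
    }
  }
  net
}

set.seed(1)
sw <- sketch_small_world(size = 10, rewire = 0.2, neighborhood = 2)
sum(sw) / 2  # number of undirected edges, unchanged by rewiring
```

Because each rewiring removes one edge and adds one, the edge count stays at size * neighborhood; only the placement of edges changes as rewire grows.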

Systematically Make Networks

Description

Creates a list of networks that systematically spans mechanisms and their respective parameters.

Usage

make_Systematic(
  net_size,
  neighborhood,
  directed = TRUE,
  net_kind = "matrix",
  mechanism_kind = "canonical",
  resolution = 100,
  resolution_min = 0.01,
  resolution_max = 0.99,
  reps = 3,
  processes = c("ER", "PA", "DM", "SW", "NM"),
  power_max = 5,
  connectance_max = 0.5,
  divergence_max = 0.5,
  mutation_max = 0.5,
  canonical = FALSE,
  cores = 1,
  verbose = TRUE
)

Arguments

net_size

Number of nodes in the network.

neighborhood

The range of nodes that form connected communities. Note: This implementation results in overlap of communities.

directed

Whether the target network is directed. Defaults to TRUE.

net_kind

If the network is an adjacency matrix ("matrix") or an edge list ("list"). Defaults to "matrix".

mechanism_kind

Either "canonical" or "grow" can be used to simulate networks. If "grow" is used, note that here it will only simulate pure mixtures made of a single mechanism. Defaults to "canonical".

resolution

The first step is to find the version of each process most similar to the target network. This parameter sets the number of parameter values to search across. Decrease to improve performance, but at the cost of accuracy. Defaults to 100.

resolution_min

The minimum parameter value to consider. Zero is not used because in many processes it results in degenerate systems (e.g. entirely unconnected networks). Currently process agnostic. Future versions will accept a vector of values, one for each process. Defaults to 0.01.

resolution_max

The maximum parameter value to consider. One is not used because in many processes it results in degenerate systems (e.g. entirely connected networks). Currently process agnostic. Future versions will accept a vector of values, one for each process. Defaults to 0.99.

reps

Defaults to 3. The number of networks to simulate for each parameter. More replicates increases accuracy by making the estimation of the parameter that produces networks most similar to the target network less idiosyncratic.

processes

Defaults to c("ER", "PA", "DM", "SW", "NM"). Vector of process abbreviations. Currently only the default five are supported. Future versions will accept user-defined network-generating functions and associated parameters. ER = Erdos-Renyi random. PA = Preferential Attachment. DM = Duplication and Mutation. SW = Small World. NM = Niche Model.

power_max

Defaults to 5. The maximum power of attachment in the Preferential Attachment process (PA).

connectance_max

Defaults to 0.5. The maximum connectance parameter for the Niche Model.

divergence_max

Defaults to 0.5. The maximum divergence parameter for the Duplication and Divergence/Mutation mechanisms.

mutation_max

Defaults to 0.5. The maximum mutation parameter for the Duplication and Mutation mechanism.

canonical

Defaults to FALSE. If TRUE the mechanisms are directed or undirected in accordance with their canonical forms. This overrides the value of 'directed'.

cores

Defaults to 1. The number of cores to run the classification on. When set to 1 parallelization will be ignored.

verbose

Defaults to TRUE. Whether to print all messages.

Details

Produces ground-truthing network data.

Value

A list. The first element contains the networks. The second contains their corresponding parameters.

References

Langendorf, R. E., & Burgess, M. G. (2020). Empirically Classifying Network Mechanisms. arXiv preprint arXiv:2012.15863.

Examples

# Import netcom
library(netcom)

make_Systematic(net_size = 10)
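
To make the roles of resolution, resolution_min, resolution_max, reps, and processes concrete, the base-R sketch below lays out one plausible design of simulations. It assumes evenly spaced parameter values and one network per process, parameter, and replicate; the grid netcom builds internally may differ (for example in how parameters are rescaled by power_max or connectance_max). The object names are hypothetical.

```r
# Illustrative sketch only: the grid netcom builds internally may differ.
processes <- c("ER", "PA", "DM", "SW", "NM")
resolution <- 100
resolution_min <- 0.01
resolution_max <- 0.99
reps <- 3

# Assumed: evenly spaced parameter values, avoiding the degenerate 0 and 1
parameters <- seq(resolution_min, resolution_max, length.out = resolution)

# One row per simulated network: every process x parameter x replicate
design <- expand.grid(
  rep = seq_len(reps),
  parameter = parameters,
  process = processes,
  stringsAsFactors = FALSE
)

nrow(design)  # length(processes) * resolution * reps
```

Under these assumptions the number of simulated networks grows multiplicatively, which is why lowering resolution or reps is the quickest way to shorten run time.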

Systematically Make Networks

Description

Creates a list of networks that systematically spans mechanisms and their respective parameters.

Usage

make_Systematic_canonical(
  net_size,
  neighborhood,
  directed = TRUE,
  net_kind = "matrix",
  resolution = 100,
  resolution_min = 0.01,
  resolution_max = 0.99,
  reps = 3,
  processes = c("ER", "PA", "DM", "SW", "NM"),
  power_max = 5,
  connectance_max = 0.5,
  divergence_max = 0.5,
  mutation_max = 0.5,
  cores = 1,
  verbose = TRUE
)

Arguments

net_size

Number of nodes in the network.

neighborhood

The range of nodes that form connected communities. Note: This implementation results in overlap of communities.

directed

Whether the target network is directed. Defaults to TRUE.

net_kind

If the network is an adjacency matrix ("matrix") or an edge list ("list"). Defaults to "matrix".

resolution

The first step is to find the version of each process most similar to the target network. This parameter sets the number of parameter values to search across. Decrease to improve performance, but at the cost of accuracy. Defaults to 100.

resolution_min

The minimum parameter value to consider. Zero is not used because in many processes it results in degenerate systems (e.g. entirely unconnected networks). Currently process agnostic. Future versions will accept a vector of values, one for each process. Defaults to 0.01.

resolution_max

The maximum parameter value to consider. One is not used because in many processes it results in degenerate systems (e.g. entirely connected networks). Currently process agnostic. Future versions will accept a vector of values, one for each process. Defaults to 0.99.

reps

Defaults to 3. The number of networks to simulate for each parameter. More replicates increases accuracy by making the estimation of the parameter that produces networks most similar to the target network less idiosyncratic.

processes

Defaults to c("ER", "PA", "DM", "SW", "NM"). Vector of process abbreviations. Currently only the default five are supported. Future versions will accept user-defined network-generating functions and associated parameters. ER = Erdos-Renyi random. PA = Preferential Attachment. DM = Duplication and Mutation. SW = Small World. NM = Niche Model.

power_max

Defaults to 5. The maximum power of attachment in the Preferential Attachment process (PA).

connectance_max

Defaults to 0.5. The maximum connectance parameter for the Niche Model.

divergence_max

Defaults to 0.5. The maximum divergence parameter for the Duplication and Divergence/Mutation mechanisms.

mutation_max

Defaults to 0.5. The maximum mutation parameter for the Duplication and Mutation mechanism.

cores

Defaults to 1. The number of cores to run the classification on. When set to 1 parallelization will be ignored.

verbose

Defaults to TRUE. Whether to print all messages.

Details

Produces ground-truthing network data.

Value

A list. The first element contains the networks. The second contains their corresponding parameters.

References

Langendorf, R. E., & Burgess, M. G. (2020). Empirically Classifying Network Mechanisms. arXiv preprint arXiv:2012.15863.

Examples

# Import netcom
library(netcom)

make_Systematic(net_size = 10)

Systematically Make Networks

Description

Creates a list of networks that systematically spans mechanisms and their respective parameters.

Usage

make_Systematic_directedCanonicalLike(
  net_size,
  directed = TRUE,
  net_kind = "matrix",
  resolution = 100,
  resolution_min = 0.01,
  resolution_max = 0.99,
  reps = 3,
  processes = c("ER", "PA", "DM", "SW", "NM"),
  power_max = 5,
  connectance_max = 0.5,
  divergence_max = 0.5,
  mutation_max = 0.5,
  cores = 1,
  verbose = TRUE
)

Arguments

net_size

Number of nodes in the network.

directed

Whether the target network is directed. Defaults to TRUE.

net_kind

If the network is an adjacency matrix ("matrix") or an edge list ("list"). Defaults to "matrix".

resolution

The first step is to find the version of each process most similar to the target network. This parameter sets the number of parameter values to search across. Decrease to improve performance, but at the cost of accuracy. Defaults to 100.

resolution_min

The minimum parameter value to consider. Zero is not used because in many processes it results in degenerate systems (e.g. entirely unconnected networks). Currently process agnostic. Future versions will accept a vector of values, one for each process. Defaults to 0.01.

resolution_max

The maximum parameter value to consider. One is not used because in many processes it results in degenerate systems (e.g. entirely connected networks). Currently process agnostic. Future versions will accept a vector of values, one for each process. Defaults to 0.99.

reps

Defaults to 3. The number of networks to simulate for each parameter. More replicates increases accuracy by making the estimation of the parameter that produces networks most similar to the target network less idiosyncratic.

processes

Defaults to c("ER", "PA", "DM", "SW", "NM"). Vector of process abbreviations. Currently only the default five are supported. Future versions will accept user-defined network-generating functions and associated parameters. ER = Erdos-Renyi random. PA = Preferential Attachment. DM = Duplication and Mutation. SW = Small World. NM = Niche Model.

power_max

Defaults to 5. The maximum power of attachment in the Preferential Attachment process (PA).

connectance_max

Defaults to 0.5. The maximum connectance parameter for the Niche Model.

divergence_max

Defaults to 0.5. The maximum divergence parameter for the Duplication and Divergence/Mutation mechanisms.

mutation_max

Defaults to 0.5. The maximum mutation parameter for the Duplication and Mutation mechanism.

cores

Defaults to 1. The number of cores to run the classification on. When set to 1 parallelization will be ignored.

verbose

Defaults to TRUE. Whether to print all messages.

Details

Produces ground-truthing network data.

Value

A list. The first element contains the networks. The second contains their corresponding parameters.

References

Langendorf, R. E., & Burgess, M. G. (2020). Empirically Classifying Network Mechanisms. arXiv preprint arXiv:2012.15863.

Examples

# Import netcom
library(netcom)

make_Systematic(net_size = 10)

Systematically Make Networks

Description

Creates a list of networks that systematically spans mechanisms and their respective parameters.

Usage

make_Systematic_mixture(
  net_size,
  neighborhood,
  directed = TRUE,
  net_kind = "matrix",
  resolution = 100,
  resolution_min = 0.01,
  resolution_max = 0.99,
  reps = 3,
  processes = c("ER", "PA", "DM", "SW", "NM"),
  power_max = 5,
  connectance_max = 0.5,
  divergence_max = 0.5,
  mutation_max = 0.5,
  canonical = FALSE,
  cores = 1,
  verbose = TRUE
)

Arguments

net_size

Number of nodes in the network.

neighborhood

The range of nodes that form connected communities. Note: This implementation results in overlap of communities.

directed

Whether the target network is directed. Defaults to TRUE.

net_kind

If the network is an adjacency matrix ("matrix") or an edge list ("list"). Defaults to "matrix".

resolution

The first step is to find the version of each process most similar to the target network. This parameter sets the number of parameter values to search across. Decrease to improve performance, but at the cost of accuracy. Defaults to 100.

resolution_min

The minimum parameter value to consider. Zero is not used because in many processes it results in degenerate systems (e.g. entirely unconnected networks). Currently process agnostic. Future versions will accept a vector of values, one for each process. Defaults to 0.01.

resolution_max

The maximum parameter value to consider. One is not used because in many processes it results in degenerate systems (e.g. entirely connected networks). Currently process agnostic. Future versions will accept a vector of values, one for each process. Defaults to 0.99.

reps

Defaults to 3. The number of networks to simulate for each parameter. More replicates increases accuracy by making the estimation of the parameter that produces networks most similar to the target network less idiosyncratic.

processes

Defaults to c("ER", "PA", "DM", "SW", "NM"). Vector of process abbreviations. Currently only the default five are supported. Future versions will accept user-defined network-generating functions and associated parameters. ER = Erdos-Renyi random. PA = Preferential Attachment. DM = Duplication and Mutation. SW = Small World. NM = Niche Model.

power_max

Defaults to 5. The maximum power of attachment in the Preferential Attachment process (PA).

connectance_max

Defaults to 0.5. The maximum connectance parameter for the Niche Model.

divergence_max

Defaults to 0.5. The maximum divergence parameter for the Duplication and Divergence/Mutation mechanisms.

mutation_max

Defaults to 0.5. The maximum mutation parameter for the Duplication and Mutation mechanism.

canonical

Defaults to FALSE. If TRUE the mechanisms are directed or undirected in accordance with their canonical forms. This overrides the value of 'directed'.

cores

Defaults to 1. The number of cores to run the classification on. When set to 1 parallelization will be ignored.

verbose

Defaults to TRUE. Whether to print all messages.

Details

Produces ground-truthing network data.

Value

A list. The first element contains the networks. The second contains their corresponding parameters.

References

Langendorf, R. E., & Burgess, M. G. (2020). Empirically Classifying Network Mechanisms. arXiv preprint arXiv:2012.15863.

Examples

# Import netcom
library(netcom)

make_Systematic(net_size = 10)

Empirical parameterization via null distributions

Description

Helper function to find the best fitting version of a mechanism by searching across the null distributions associated with a process + parameter combination.

Usage

null_fit_optim(
  parameter,
  process,
  network,
  net_size,
  iters,
  neighborhood,
  directed,
  DD_kind,
  DD_weight,
  net_kind,
  mechanism_kind,
  method,
  size_different,
  power_max,
  connectance_max,
  divergence_max,
  best_fit_sd,
  max_norm,
  cause_orientation,
  cores,
  null_dist_trim,
  ks_dither,
  ks_alternative,
  verbose = FALSE
)

Arguments

parameter

The parameter being tested for its ability to generate networks similar to the input 'network'.

process

Name of mechanism. Currently only "ER", "PA", "DD", "DM", "SW", and "NM" are supported. Future versions will accept user-defined network-generating functions and associated parameters. ER = Erdos-Renyi random. PA = Preferential Attachment. DD = Duplication and Divergence. DM = Duplication and Mutation. SW = Small World. NM = Niche Model.

network

The network being compared to a hypothesized 'process' with a given 'parameter' value.

net_size

Number of nodes in the network.

iters

Number of replicates in the null distribution. Note that length(null_dist) = ((iters^2)-iters)/2.

neighborhood

The range of nodes that form connected communities. Note: This implementation results in overlap of communities.

directed

Whether the target network is directed.

DD_kind

A vector of network properties to be used to compare networks.

DD_weight

A vector of weights for the relative importance of the network properties in DD_kind being used to compare networks. Should be the same length as DD_kind.

net_kind

If the network is an adjacency matrix ("matrix") or an edge list ("list").

mechanism_kind

Either "canonical" or "grow" can be used to simulate networks. If "grow" is used, note that here it will only simulate pure mixtures made of a single mechanism.

method

This determines the method used to compare networks at the heart of the classification. Currently "DD" (Degree Distribution) and "align" (the align function which compares networks by the entropy of diffusion on them) are supported. Future versions will allow user-defined methods.

size_different

If there is a difference in the size of the networks used in the null distribution.

power_max

The maximum power of attachment in the Preferential Attachment process (PA).

connectance_max

The maximum connectance parameter for the Niche Model.

divergence_max

The maximum divergence parameter for the Duplication and Divergence/Mutation mechanisms.

best_fit_sd

Standard Deviation used to simulate networks with a similar but not identical best fit parameter. This is important because simulating networks with the identical parameter artificially inflates the false negative rate by assuming the best fit parameter is the true parameter. For large resolution and reps values this will become true, but also computationally intractable for realistically large systems.

max_norm

Binary variable indicating if each network property should be normalized so its max value (if a node-level property) is one.

cause_orientation

The orientation of directed adjacency matrices.

cores

The number of cores to run the classification on. When set to 1 parallelization will be ignored.

null_dist_trim

Number between zero and one that determines how much of each network comparison distribution (unknown network compared to simulated networks, simulated networks compared to each other) should be used. Prevents p-value convergence with large sample sizes. Defaults to 1, which means all comparisons are used (no trimming).

ks_dither

The KS test cannot compute exact p-values when some pairwise network distances are tied (not all unique). Adding small amounts of noise makes each distance unique. We are not aware of a study on the impacts this has on accuracy so it is set to zero by default.

ks_alternative

Governs the KS test. Assuming best_fit_sd is not too large, this can be set to "greater" because the target network cannot be more similar to identically simulated networks than they are to each other. In practice we have found "greater" and "less" produce numerical errors. Only "two.sided", "less", and "greater" are supported through stats::ks.test().

verbose

Defaults to FALSE. Whether to print all messages.

Details

Note: Currently each process is assumed to have a single governing parameter.

Value

A number measuring how different the input network is from the parameter + process combination.
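
The idea of comparing a target network against a null distribution can be sketched in base R. The example below is a simplified illustration, not the computation null_fit_optim performs: simulate_er and net_distance are hypothetical stand-ins for a mechanism and a network comparison method, and the KS test is applied directly to the two sets of distances.

```r
# Illustrative sketch only: NOT netcom's implementation of null_fit_optim.
set.seed(42)
iters <- 20
net_size <- 10

# Hypothetical stand-in for a mechanism: Erdos-Renyi networks with edge probability p
simulate_er <- function(n, p) {
  net <- matrix(rbinom(n^2, size = 1, prob = p), nrow = n, ncol = n)
  diag(net) <- 0
  net
}

# Hypothetical stand-in for a network comparison: distance between sorted out-degrees
net_distance <- function(a, b) {
  sqrt(sum((sort(rowSums(a)) - sort(rowSums(b)))^2))
}

target <- simulate_er(net_size, p = 0.3)
null_nets <- lapply(seq_len(iters), function(i) simulate_er(net_size, p = 0.3))

# Null distribution: every pair of simulated networks compared once
pairs <- combn(iters, 2)
null_dist <- apply(pairs, 2, function(ij) {
  net_distance(null_nets[[ij[1]]], null_nets[[ij[2]]])
})

# Target distribution: the target compared to each simulated network
target_dist <- vapply(null_nets, net_distance, numeric(1), b = target)

length(null_dist) == (iters^2 - iters) / 2  # TRUE by construction

# exact = FALSE because these integer-derived distances contain ties
fit <- stats::ks.test(target_dist, null_dist, exact = FALSE)
fit$statistic
```

A small KS statistic here would suggest the target sits among the simulated networks about as closely as they sit among each other, which is the sense in which a process and parameter combination "fits".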

References

Langendorf, R. E., & Burgess, M. G. (2020). Empirically Classifying Network Mechanisms. arXiv preprint arXiv:2012.15863.

Examples

# Import netcom
library(netcom)

# Adjacency matrix
size <- 10
network <- matrix(sample(c(0,1), size = size^2, replace = TRUE), nrow = size, ncol = size)

# Calculate how similar the input network is to Small-World networks with 
# a rewiring probability of 0.28.
null_fit_optim(
     parameter = 0.28, 
     process = "SW", 
     network = network, 
     net_size = size, 
     iters = 20,
     neighborhood = max(1, round(0.1 * size)),
     net_kind = "matrix", 
     mechanism_kind = "grow", 
     power_max = 5, 
     connectance_max = 0.5, 
     divergence_max = 0.5, 
     cores = 1, 
     directed = TRUE, 
     method = "DD", 
     size_different = FALSE,
     cause_orientation = "row", 
     DD_kind = c(
         "in", "out", "entropy_in", "entropy_out", 
         "clustering_coefficient", "page_rank", "communities"
     ), 
     DD_weight = 1, 
     best_fit_sd = 0,
     max_norm = FALSE,
     null_dist_trim = 0,
     ks_dither = 0,
     ks_alternative = "two.sided",
     verbose = FALSE
)
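
The effect of a nonzero ks_dither can be illustrated in base R. The sketch below assumes dithering means adding small uniform noise to each distance; how netcom generates the noise is not stated here, so treat the noise distribution and the value 1e-6 as assumptions.

```r
# Illustrative sketch only: how dithering can break ties before a KS test.
set.seed(7)

# Distances with many repeated values, as happens with small discrete networks
distances <- sample(c(1, 2, 3), size = 30, replace = TRUE)
anyDuplicated(distances) > 0  # ties are present

# Hypothetical dither amount: uniform noise much smaller than the gaps between values
ks_dither <- 1e-6
dithered <- distances + runif(length(distances), min = -ks_dither, max = ks_dither)

anyDuplicated(dithered) == 0                 # ties broken (with probability essentially one)
max(abs(dithered - distances)) <= ks_dither  # values barely move
```

Keeping the noise far smaller than the smallest gap between distinct distances preserves their ordering across tie groups while making every value unique.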

Stirs a Duplication and Divergence Network

Description

Stirs an already existing network by rewiring a node according to the Duplication and Divergence mechanism.

Usage

stir_DD(
  matrix,
  x,
  divergence,
  directed = TRUE,
  link = 0,
  force_connected = FALSE
)

Arguments

matrix

Existing network to be rewired (stirred).

x

The ID of the node to be rewired (stirred).

divergence

Probability that the new node loses edges associated with the node it duplicates. Needs to be between zero and one.

directed

Binary variable determining if the network is directed, resulting in off-diagonal asymmetry in the adjacency matrix.

link

Probability that the new node attaches to the node it duplicates. Defaults to 0.

force_connected

Binary argument determining if the stirred node has to be connected to the existing network. Defaults to FALSE, to prevent rare computational slow-downs when it is unlikely to create a connected network.

Details

Different from Duplication & Mutation models in that edges can only be lost.

Value

An adjacency matrix.

References

Ispolatov, I., Krapivsky, P. L., & Yuryev, A. (2005). Duplication-divergence model of protein interaction network. Physical review E, 71(6), 061911.

Examples

# Import netcom
library(netcom)

size <- 10
existing_network <- matrix(sample(c(0,1), size = size^2, replace = TRUE), nrow = size, ncol = size)
new_network_prep <- matrix(0, nrow = size + 1, ncol = size + 1)
new_network_prep[1:size, 1:size] = existing_network
new_network <- stir_DD(matrix = new_network_prep, x = size + 1, divergence = 0.5)
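
As a rough picture of the mechanism, the base-R sketch below duplicates a randomly chosen node's outgoing edges onto node x and then drops each copied edge with probability divergence. It is not the code stir_DD runs: the helper sketch_dd is hypothetical, rows are assumed to hold outgoing edges, and the way the duplicated node is chosen is an assumption.

```r
# Illustrative sketch only: NOT netcom's stir_DD implementation.
sketch_dd <- function(net, x, divergence, link = 0) {
  others <- setdiff(seq_len(nrow(net)), x)
  # Pick an existing node to duplicate
  parent <- others[sample.int(length(others), 1)]
  # Copy the parent's outgoing edges, then drop each with probability `divergence`
  copied <- net[parent, ]
  kept <- copied * rbinom(length(copied), size = 1, prob = 1 - divergence)
  net[x, ] <- kept
  net[x, x] <- 0  # no self-loop
  # Optionally connect the copy to the node it duplicated
  if (runif(1) < link) net[x, parent] <- 1
  list(net = net, parent = parent)
}

set.seed(11)
size <- 10
net <- matrix(sample(c(0, 1), size^2, replace = TRUE), nrow = size)
out <- sketch_dd(net, x = 3, divergence = 0.5)

# With link = 0 the stirred node's edges are a subset of its parent's edges
all(out$net[3, ] <= net[out$parent, ])
```

The final check reflects the point made in Details: under divergence alone, copied edges can be lost but none can be gained.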

Stirs a Duplication and Mutation Network

Description

Stirs an already existing network by rewiring a node according to the Duplication and Mutation mechanism.

Usage

stir_DM(
  matrix,
  x,
  divergence,
  mutation,
  directed = TRUE,
  link = 0,
  force_connected = FALSE
)

Arguments

matrix

Existing network to be rewired (stirred).

x

The ID of the node to be rewired (stirred).

divergence

Probability that the new node loses edges associated with the node it duplicates. Needs to be between zero and one.

mutation

Probability that the new node gains edges not associated with the node it duplicates. Needs to be between zero and one.

directed

Binary variable determining if the network is directed, resulting in off-diagonal asymmetry in the adjacency matrix.

link

Probability that the new node attaches to the node it duplicates. Defaults to 0.

force_connected

Binary argument determining if the stirred node has to be connected to the existing network. Defaults to FALSE, to prevent rare computational slow-downs when it is unlikely to create a connected network.

Details

Different from Duplication & Divergence models in that edges can be gained (through mutation) as well as lost.

Value

An adjacency matrix.

References

Ispolatov, I., Krapivsky, P. L., & Yuryev, A. (2005). Duplication-divergence model of protein interaction network. Physical review E, 71(6), 061911.

Examples

# Import netcom
library(netcom)

size <- 10
existing_network <- matrix(sample(c(0,1), size = size^2, replace = TRUE), nrow = size, ncol = size)
new_network_prep <- matrix(0, nrow = size + 1, ncol = size + 1)
new_network_prep[1:size, 1:size] = existing_network
new_network <- stir_DM(matrix = new_network_prep, x = size + 1, divergence = 0.5, mutation = 0.21)

Stirs an Erdos-Renyi Random Network

Description

Stirs an already existing network by rewiring a node according to the Erdos-Renyi random mechanism.

Usage

stir_ER(matrix, x, p, directed = TRUE, retcon = FALSE)

Arguments

matrix

Existing network to be rewired (stirred).

x

The ID of the node to be rewired (stirred).

p

Probability that each possible edge exists. Needs to be between zero and one.

directed

Binary variable determining if the network is directed, resulting in off-diagonal asymmetry in the adjacency matrix.

retcon

Binary variable determining if already existing nodes can attach to new nodes. Defaults to FALSE.

Details

Stirs a node in a network according to the Erdos-Renyi random mechanism.

Value

An adjacency matrix.

References

Erdos, P. and Renyi, A., On random graphs, Publicationes Mathematicae 6, 290–297 (1959).

Examples

# Import netcom
library(netcom)

size <- 10
existing_network <- matrix(sample(c(0,1), size = size^2, replace = TRUE), nrow = size, ncol = size)
new_network_prep <- matrix(0, nrow = size + 1, ncol = size + 1)
new_network_prep[1:size, 1:size] = existing_network
new_network <- stir_ER(matrix = new_network_prep, x = size + 1, p = 0.5)
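
One way to picture stirring a node under the Erdos-Renyi mechanism is to redraw that node's edges independently with probability p. The base-R sketch below does this for outgoing edges only; it is an assumption-laden illustration, not the code stir_ER runs, and the helper sketch_stir_er is hypothetical.

```r
# Illustrative sketch only: NOT netcom's stir_ER implementation.
sketch_stir_er <- function(net, x, p) {
  n <- nrow(net)
  # Redraw every outgoing edge of node x independently with probability p
  net[x, ] <- rbinom(n, size = 1, prob = p)
  net[x, x] <- 0  # no self-loop
  net
}

set.seed(3)
net <- matrix(rbinom(100, size = 1, prob = 0.5), nrow = 10, ncol = 10)
stirred <- sketch_stir_er(net, x = 4, p = 0.5)

# Every row other than row 4 is left untouched
all(stirred[-4, ] == net[-4, ])
```

In this sketch the rest of the network is untouched, so repeated stirring of different nodes gradually replaces the original structure with independent random edges.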

Stirs a Niche Model Network

Description

Stirs an already existing network by rewiring a node according to the Niche Model mechanism.

Usage

stir_NM(matrix, x, niches, directed = TRUE, connectance = 0.2)

Arguments

matrix

Existing network to experience rewiring (stirring).

x

The ID of the node to be rewired (stirred).

niches

Vector of length x, with values between zero and one corresponding to each node's niche.

directed

Binary variable determining if the network is directed, resulting in off-diagonal asymmetry in the adjacency matrix. Defaults to TRUE.

connectance

Niche Model parameter specifying the expected connectivity of the network, which determines for a given node the niche space window within which it attaches to every other node. Defaults to 0.2.

Details

Stirs a node in a Niche Model network.

Value

An adjacency matrix.

References

Williams, R. J., & Martinez, N. D. (2000). Simple rules yield complex food webs. Nature, 404(6774), 180-183.

Examples

# Import netcom
library(netcom)

size <- 10
existing_network <- matrix(sample(c(0,1), size = size^2, replace = TRUE), nrow = size, ncol = size)
new_network_prep <- matrix(0, nrow = size + 1, ncol = size + 1)
new_network_prep[1:size, 1:size] = existing_network
new_network <- stir_NM(
     matrix = new_network_prep, 
     x = size + 1, 
     connectance = 0.1, 
     niches = runif(size + 1)
)

Stirs a Preferential Attachment Network

Description

Stirs an already existing network by rewiring a node according to the Preferential Attachment mechanism.

Usage

stir_PA(
  matrix,
  x,
  power,
  directed = TRUE,
  retcon = FALSE,
  sum_v_max = "max",
  nascent_help = TRUE
)

Arguments

matrix

Existing network to be rewired (stirred).

x

The ID of the node to be rewired (stirred).

power

Power of attachment, which determines how much new nodes prefer to attach to nodes that have many edges compared to few. Needs to be positive.

directed

Binary variable determining if the network is directed, resulting in off-diagonal asymmetry in the adjacency matrix.

retcon

Binary variable determining if already existing nodes can attach to new nodes. Defaults to FALSE.

sum_v_max

Degree distributions must be normalized, either by their "max" or "sum". Defaults to "max".

nascent_help

Whether a single edge should be added to the degree distribution of all nodes so that nodes with a zero in-degree can still have a chance of being attached to by new nodes. Defaults to TRUE.

Details

Rewires a node in a network according to the Preferential Attachment mechanism.

Value

An adjacency matrix.

References

Barabási, A. L., & Albert, R. (1999). Emergence of scaling in random networks. Science, 286(5439), 509-512.

Examples

# Import netcom
library(netcom)

size <- 10
existing_network <- matrix(sample(c(0,1), size = size^2, replace = TRUE), nrow = size, ncol = size)
new_network_prep <- matrix(0, nrow = size + 1, ncol = size + 1)
new_network_prep[1:size, 1:size] = existing_network
new_network <- stir_PA(matrix = new_network_prep, x = size + 1, power = 2.15)
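
The interplay of power, nascent_help, and sum_v_max can be pictured with a few lines of base R. This sketch assumes the extra edge is added before raising degrees to the power and that normalization happens last; the order netcom uses internally is not stated here, so the numbers are illustrative only.

```r
# Illustrative sketch only: NOT netcom's stir_PA implementation.
in_degree <- c(0, 1, 3, 6)
power <- 2
nascent_help <- TRUE
sum_v_max <- "max"

# Optionally give every node one extra edge so zero-degree nodes can be chosen
degree <- if (nascent_help) in_degree + 1 else in_degree

# Raise to the power of attachment, then normalize by the max or the sum
weight <- degree^power
attach <- if (sum_v_max == "max") weight / max(weight) else weight / sum(weight)
attach
```

Under these assumptions, normalizing by the max gives the best-connected node a weight of one, while normalizing by the sum makes the weights add up to one; larger values of power concentrate attachment on high-degree nodes in either case.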

Stirs a Small-World Network

Description

Stirs an already existing network by rewiring a node according to the Small-World mechanism.

Usage

stir_SW(matrix, x, rewire, directed = TRUE)

Arguments

matrix

Existing network to be rewired (stirred).

x

The ID of the node to be rewired (stirred).

rewire

Small-World parameter specifying the probability each edge is randomly rewired, allowing for the possibility of bridges between connected communities.

directed

Binary variable determining if the network is directed, resulting in off-diagonal asymmetry in the adjacency matrix.

Details

Rewires a node in a network according to the Small-World mechanism.

Value

An adjacency matrix.

References

Watts, D. J., & Strogatz, S. H. (1998). Collective dynamics of 'small-world' networks. Nature, 393(6684), 440-442.

Examples

# Import netcom
library(netcom)

size <- 10
existing_network <- matrix(sample(c(0,1), size = size^2, replace = TRUE), nrow = size, ncol = size)
new_network_prep <- matrix(0, nrow = size + 1, ncol = size + 1)
new_network_prep[1:size, 1:size] = existing_network
new_network <- stir_SW(matrix = new_network_prep, x = size + 1, rewire = 0.213)