Efficient simulation and analysis of mid-sized networks

Luis E. Castro, Xu Dong, Nazrul I Shaikh

Research output: Contribution to journal › Article

Abstract

There is growing interest in developing the abilities to simulate realistic social networks and analyze data generated from existing online social networks such as Facebook and Twitter. Amongst other things, researchers and practitioners need these abilities to study how opinions and information diffuse over networks and to identify the influential agents in networks. However, the sizes of the social networks that need to be simulated and the amount of user-generated data that needs to be analyzed are growing at a faster rate than the computational power of most modern-day computers. This paper presents a memory-efficient network representation and computational resource allocation algorithm that yields a scale-up of about 400; thus, given a constraint on the availability of computational resources, researchers can now use the proposed algorithm to simulate and analyze networks that are more than 100 times larger than what they could simulate otherwise. The proposed network representation is conducive to multi-core processing and random node sampling. Algorithms for computationally efficient execution of three random-node-sampling-based methods to estimate network metrics such as the network diameter and average path length are also presented in the paper. These algorithms yield a speed-up of about 40 even when the researcher requires a precision of more than 98%. The scale-up and speed-up numbers are based on a detailed performance analysis of the proposed algorithms that was conducted on synthetic networks of sizes ranging from 1,000 to 1,000,000 nodes. The observed scale-up and speed-up performance of the proposed algorithms has been validated against the algorithms used in igraph and statnet, two popular network data analysis software packages, and these results are also presented in this paper.
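
The abstract does not spell out the three sampling-based methods, so the following is only a generic illustration of the idea, not the authors' algorithm. It is a minimal Python sketch, assuming an undirected network stored as a plain adjacency dictionary, that estimates the average path length and a lower bound on the diameter by running breadth-first search from a random sample of source nodes. The function names `bfs_distances` and `estimate_path_metrics` and the `n_samples` and `seed` parameters are hypothetical.

```python
import random
from collections import deque


def bfs_distances(adj, source):
    """Return hop distances from `source` to every node reachable from it."""
    dist = {source: 0}
    queue = deque([source])
    while queue:
        u = queue.popleft()
        for v in adj[u]:
            if v not in dist:
                dist[v] = dist[u] + 1
                queue.append(v)
    return dist


def estimate_path_metrics(adj, n_samples, seed=0):
    """Estimate average path length and a diameter lower bound.

    Runs BFS only from `n_samples` randomly chosen source nodes
    (an assumed, generic sampling scheme) instead of from every node.
    """
    rng = random.Random(seed)
    nodes = list(adj)
    sources = rng.sample(nodes, min(n_samples, len(nodes)))
    total, pairs, longest = 0, 0, 0
    for s in sources:
        for t, d in bfs_distances(adj, s).items():
            if t != s:
                total += d
                pairs += 1
                longest = max(longest, d)
    avg = total / pairs if pairs else 0.0
    return avg, longest


# Toy example: a ring of 10 nodes, sampled from 3 sources.
ring = {i: [(i - 1) % 10, (i + 1) % 10] for i in range(10)}
avg, diam = estimate_path_metrics(ring, n_samples=3)
```

Because each BFS is independent of the others, the loop over sampled sources could be split across cores, which is in the spirit of the multi-core processing the abstract mentions. The longest distance seen from the sampled sources can only underestimate the true diameter, so it should be read as a lower bound.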

Original language: English (US)
Pages (from-to): 273-288
Number of pages: 16
Journal: Computers and Industrial Engineering
Volume: 119
DOI: 10.1016/j.cie.2018.03.008
State: Published - May 1 2018

Fingerprint

Sampling
Software packages
Resource allocation
Availability
Data storage equipment
Processing

Keywords

  • Computational efficiency
  • Egocentric networks
  • Multi-core processing
  • Network simulation
  • Node sampling
  • Vectorization

ASJC Scopus subject areas

  • Computer Science(all)
  • Engineering(all)

Cite this

Efficient simulation and analysis of mid-sized networks. / Castro, Luis E.; Dong, Xu; Shaikh, Nazrul I.

In: Computers and Industrial Engineering, Vol. 119, 01.05.2018, p. 273-288.

@article{1d643c8334b54a2d8d87f4a3a22c08d1,
title = "Efficient simulation and analysis of mid-sized networks",
abstract = "There is growing interest in developing the abilities to simulate realistic social networks and analyze data generated from existing online social networks such as Facebook and Twitter. Amongst other things, researchers and practitioners need these abilities to study how opinions and information diffuse over networks and to identify the influential agents in networks. However, the sizes of the social networks that need to be simulated and the amount of user-generated data that needs to be analyzed are growing at a faster rate than the computational power of most modern-day computers. This paper presents a memory-efficient network representation and computational resource allocation algorithm that yields a scale-up of about 400; thus, given a constraint on the availability of computational resources, researchers can now use the proposed algorithm to simulate and analyze networks that are more than 100 times larger than what they could simulate otherwise. The proposed network representation is conducive to multi-core processing and random node sampling. Algorithms for computationally efficient execution of three random-node-sampling-based methods to estimate network metrics such as the network diameter and average path length are also presented in the paper. These algorithms yield a speed-up of about 40 even when the researcher requires a precision of more than 98{\%}. The scale-up and speed-up numbers are based on a detailed performance analysis of the proposed algorithms that was conducted on synthetic networks of sizes ranging from 1,000 to 1,000,000 nodes. The observed scale-up and speed-up performance of the proposed algorithms has been validated against the algorithms used in igraph and statnet, two popular network data analysis software packages, and these results are also presented in this paper.",
keywords = "Computational efficiency, Egocentric networks, Multi-core processing, Network simulation, Node sampling, Vectorization",
author = "Castro, {Luis E.} and Xu Dong and Shaikh, {Nazrul I}",
year = "2018",
month = "5",
day = "1",
doi = "10.1016/j.cie.2018.03.008",
language = "English (US)",
volume = "119",
pages = "273--288",
journal = "Computers and Industrial Engineering",
issn = "0360-8352",
publisher = "Elsevier Limited",

}

TY - JOUR

T1 - Efficient simulation and analysis of mid-sized networks

AU - Castro, Luis E.

AU - Dong, Xu

AU - Shaikh, Nazrul I

PY - 2018/5/1

Y1 - 2018/5/1

N2 - There is growing interest in developing the abilities to simulate realistic social networks and analyze data generated from existing online social networks such as Facebook and Twitter. Amongst other things, researchers and practitioners need these abilities to study how opinions and information diffuse over networks and to identify the influential agents in networks. However, the sizes of the social networks that need to be simulated and the amount of user-generated data that needs to be analyzed are growing at a faster rate than the computational power of most modern-day computers. This paper presents a memory-efficient network representation and computational resource allocation algorithm that yields a scale-up of about 400; thus, given a constraint on the availability of computational resources, researchers can now use the proposed algorithm to simulate and analyze networks that are more than 100 times larger than what they could simulate otherwise. The proposed network representation is conducive to multi-core processing and random node sampling. Algorithms for computationally efficient execution of three random-node-sampling-based methods to estimate network metrics such as the network diameter and average path length are also presented in the paper. These algorithms yield a speed-up of about 40 even when the researcher requires a precision of more than 98%. The scale-up and speed-up numbers are based on a detailed performance analysis of the proposed algorithms that was conducted on synthetic networks of sizes ranging from 1,000 to 1,000,000 nodes. The observed scale-up and speed-up performance of the proposed algorithms has been validated against the algorithms used in igraph and statnet, two popular network data analysis software packages, and these results are also presented in this paper.

KW - Computational efficiency

KW - Egocentric networks

KW - Multi-core processing

KW - Network simulation

KW - Node sampling

KW - Vectorization

UR - http://www.scopus.com/inward/record.url?scp=85044925650&partnerID=8YFLogxK

UR - http://www.scopus.com/inward/citedby.url?scp=85044925650&partnerID=8YFLogxK

U2 - 10.1016/j.cie.2018.03.008

DO - 10.1016/j.cie.2018.03.008

M3 - Article

VL - 119

SP - 273

EP - 288

JO - Computers and Industrial Engineering

JF - Computers and Industrial Engineering

SN - 0360-8352

ER -