Wikipedia talk (en)

This is the communication network of the English Wikipedia. Nodes represent users, and an edge from user A to user B denotes that user A wrote a message on the talk page of user B at a certain timestamp.

Metadata

Code: Ten
Internal name: wiki_talk_en
Name: Wikipedia talk (en)
Data source: https://zenodo.org/record/49561
Availability: Dataset is available for download
Consistency check: Dataset passed all tests
Category: Communication network
Dataset timestamp: 2017-10-27
Node meaning: User
Edge meaning: Message
Network format: Unipartite, directed
Edge type: Unweighted, multiple edges
Temporal data: Edges are annotated with timestamps
Reciprocal: Contains reciprocal edges
Directed cycles: Contains directed cycles
Loops: Contains loops
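
The properties above (directed, multiple edges, timestamps, loops, reciprocal edges) can be checked on an edge list with a few lines of Python. The sketch below is illustrative only: it assumes a whitespace-separated layout of "source target weight timestamp" with "%" comment lines, and the sample rows are hypothetical, not taken from the actual dataset file.

```python
import io
from collections import Counter

# Hypothetical sample (assumed layout, not real data):
# "%" comment lines, then "source target weight timestamp" per message.
SAMPLE = """\
% asym positive
1 2 1 1100000000
1 2 1 1100000500
2 1 1 1100000900
3 3 1 1100001000
3 1 1 1100002000
"""

def read_edges(handle):
    """Parse (source, target, timestamp) triples, skipping comments and blank lines."""
    edges = []
    for line in handle:
        line = line.strip()
        if not line or line.startswith("%"):
            continue
        fields = line.split()
        source, target = int(fields[0]), int(fields[1])
        timestamp = int(fields[3]) if len(fields) > 3 else None
        edges.append((source, target, timestamp))
    return edges

edges = read_edges(io.StringIO(SAMPLE))

# Count how often each ordered (source, target) pair occurs.
multiplicity = Counter((s, t) for s, t, _ in edges)

nodes = {v for s, t, _ in edges for v in (s, t)}
volume = len(edges)                # every message counts, repeats included
unique_edges = len(multiplicity)   # distinct ordered pairs
loops = sum(1 for s, t, _ in edges if s == t)  # messages on one's own talk page
has_reciprocal = any(
    (t, s) in multiplicity for (s, t) in multiplicity if s != t
)

print(len(nodes), volume, unique_edges, loops, has_reciprocal)
```

Reading from a file instead of the in-memory sample only requires passing an open file handle to `read_edges`.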

Statistics

Size n =2,987,535
Volume m =24,981,163
Unique edge count m̿ =9,379,561
Loop count l =5,655,527
Wedge count s =57,066,712,805
Claw count z =1,510,569,161,023,742
Cross count x =4.396 49 × 10^19
Triangle count t =41,915,754
Square count q =22,498,726,804
Maximum degree dmax =488,182
Maximum outdegree d+max =488,169
Maximum indegree d−max =121,250
Average degree d =16.723 6
Fill p =1.050 89 × 10^−6
Average edge multiplicity m̃ =2.663 36
Size of LCC N =2,859,574
Size of LSCC Ns =249,610
Relative size of LSCC Nrs =0.083 550 5
Diameter δ =9
50-Percentile effective diameter δ0.5 =3.233 96
90-Percentile effective diameter δ0.9 =3.877 48
Median distance δM =4
Mean distance δm =3.658 54
Gini coefficient G =0.899 562
Balanced inequality ratio P =0.109 987
Outdegree balanced inequality ratio P+ =0.073 885 3
Indegree balanced inequality ratio P− =0.160 171
Relative edge distribution entropy Her =0.785 090
Power law exponent γ =2.827 07
Tail power law exponent γt =1.811 00
Degree assortativity ρ =−0.096 231 0
Degree assortativity p-value pρ =0.000 00
Clustering coefficient c =0.002 203 51
Directed clustering coefficient c± =0.017 955 2
Spectral norm α =90,816.2
Operator 2-norm ν =45,410.1
Cyclic eigenvalue π =45,406.0
Spectral separation |λ1[A] / λ2[A]| =1.271 98
Reciprocity y =0.214 524
Non-bipartivity bA =0.829 994
Normalized non-bipartivity bN =0.031 296 9
Algebraic non-bipartivity χ =0.064 187 3
Spectral bipartite frustration bK =0.002 707 63
Controllability C =2,488,813
Relative controllability Cr =0.833 066
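
Several of the ratios listed above follow from the basic counts. The sketch below recomputes them under assumed definitions that are not stated on this page: average degree as 2m/n, average edge multiplicity as m divided by the unique edge count, fill as the unique edge count divided by n², and the relative LSCC size and relative controllability as the corresponding counts divided by n. The variable names are illustrative.

```python
# Basic counts copied from the statistics above.
n = 2_987_535            # size (number of users)
m = 24_981_163           # volume (number of messages)
m_unique = 9_379_561     # unique edge count
lscc = 249_610           # size of the largest strongly connected component
controllability = 2_488_813

# Assumed definitions (not stated on the page):
average_degree = 2 * m / n                 # each message touches two endpoints
average_multiplicity = m / m_unique        # messages per distinct ordered pair
fill = m_unique / n**2                     # fraction of possible ordered pairs, loops allowed
relative_lscc = lscc / n
relative_controllability = controllability / n

print(f"average degree           {average_degree:.4f}")
print(f"average multiplicity     {average_multiplicity:.5f}")
print(f"fill                     {fill:.5e}")
print(f"relative LSCC size       {relative_lscc:.7f}")
print(f"relative controllability {relative_controllability:.6f}")
```

Under these assumptions the computed values should agree with the listed ones to the precision shown, which serves as a quick consistency check on the table.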

Plots

Degree distribution
Cumulative degree distribution
Lorenz curve
Spectral distribution of the adjacency matrix
Spectral distribution of the normalized adjacency matrix
Spectral distribution of the Laplacian
Spectral graph drawing based on the adjacency matrix
Spectral graph drawing based on the Laplacian
Spectral graph drawing based on the normalized adjacency matrix
Degree assortativity
Zipf plot
Hop distribution
In/outdegree scatter plot
Edge weight/multiplicity distribution
Clustering coefficient distribution
Temporal distribution
SynGraphy
Inter-event distribution
Matrix decomposition plots

References

[1] Jérôme Kunegis. KONECT – The Koblenz Network Collection. In Proc. Int. Conf. on World Wide Web Companion, pages 1343–1350, 2013.
[2] Jun Sun, Jérôme Kunegis, and Steffen Staab. Predicting user roles in social networks using transfer learning with feature transformation. In Proc. ICDM Workshop on Data Min. in Netw., 2016.