Reuters
This is the bipartite network of story–word inclusions in Reuters news
stories collected in the Reuters Corpus, Volume 1 (RCV1). Left nodes
represent stories; right nodes represent words. An edge represents the
inclusion of a word in a story.
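As a minimal sketch of this structure, the snippet below builds a toy bipartite story–word network and counts its nodes and edges. The toy corpus and all variable names are hypothetical, not taken from RCV1. It also assumes that a word occurring several times in a story yields parallel edges, which would be consistent with the dataset reporting a volume larger than its unique edge count.

```python
from collections import Counter

# Hypothetical toy corpus: story ID -> word tokens (not real RCV1 data).
stories = {
    "s1": ["oil", "price", "oil"],
    "s2": ["price", "market"],
}

# Assumption: one edge per word occurrence, so a repeated word in a
# story produces parallel edges between the same story-word pair.
edges = Counter(
    (story, word) for story, words in stories.items() for word in words
)

left_nodes = {story for story, _ in edges}   # stories
right_nodes = {word for _, word in edges}    # words

volume = sum(edges.values())   # edges counted with multiplicity
unique_edges = len(edges)      # distinct story-word pairs

print(len(left_nodes), len(right_nodes), volume, unique_edges)
```

Storing edges in a `Counter` keyed by (story, word) keeps the multiplicity of each pair, so both the volume and the unique edge count can be read from the same structure.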
Statistics
Size  n =  1,065,176

Left size  n_{1} =  781,265

Right size  n_{2} =  283,911

Volume  m =  96,903,520

Unique edge count  m̿ =  60,569,726

Wedge count  s =  1,546,388,153,215

Claw count  z =  64,992,574,078,209,184

Cross count  x =  2.950 23 × 10^{21}

Maximum degree  d_{max} =  345,056

Maximum left degree  d_{1max} =  1,585

Maximum right degree  d_{2max} =  345,056

Average degree  d =  181.948

Average left degree  d_{1} =  124.034

Average right degree  d_{2} =  341.317

Fill  p =  0.000 273 071

Average edge multiplicity  m̃ =  1.599 87

Size of LCC  N =  1,065,175

Diameter  δ =  6

50-percentile effective diameter  δ_{0.5} =  2.097 31

90-percentile effective diameter  δ_{0.9} =  3.331 29

Median distance  δ_{M} =  3

Mean distance  δ_{m} =  2.693 82

Gini coefficient  G =  0.680 191

Balanced inequality ratio  P =  0.247 840

Left balanced inequality ratio  P_{1} =  0.345 764

Right balanced inequality ratio  P_{2} =  0.042 455 4

Relative edge distribution entropy  H_{er} =  0.826 289

Power law exponent  γ =  1.295 75

Tail power law exponent  γ_{t} =  2.511 00

Degree assortativity  ρ =  −0.124 689

Degree assortativity p-value  p_{ρ} =  0.000 00

Spectral norm  α =  6,502.10

Spectral separation  λ_{1}[A] / λ_{2}[A] =  1.350 62
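
Several of the statistics above can be recomputed from the four basic counts (left size, right size, volume, unique edge count). The sketch below does this using the usual definitions for bipartite networks. The formulas are an assumption about how the listed values were obtained, but they do reproduce the figures given here to the printed precision. Variable names are illustrative.

```python
# Basic counts copied from the statistics list.
n1 = 781_265            # left size (stories)
n2 = 283_911            # right size (words)
m = 96_903_520          # volume (edges counted with multiplicity)
m_unique = 60_569_726   # unique edge count

# Derived quantities under the assumed standard definitions.
n = n1 + n2                      # size
avg_degree = 2 * m / n           # average degree
avg_left = m / n1                # average left degree
avg_right = m / n2               # average right degree
fill = m_unique / (n1 * n2)      # fraction of possible story-word pairs present
avg_mult = m / m_unique          # average edge multiplicity

print(n)
print(round(avg_degree, 3), round(avg_left, 3), round(avg_right, 3))
print(fill, round(avg_mult, 5))
```

Note that the fill uses the unique edge count rather than the volume, since parallel edges do not add new story–word pairs.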

References
[1] Jérôme Kunegis.
KONECT – The Koblenz Network Collection.
In Proc. Int. Conf. on World Wide Web Companion, pages 1343–1350, 2013.

[2] David D. Lewis, Yiming Yang, Tony G. Rose, and Fan Li.
RCV1: A new benchmark collection for text categorization research.
J. Mach. Learn. Res., 5:361–397, 2004.
