TREC (disks 4–5)
This is the bipartite network of about 556,000 text documents from Disks 4 and
5 of the Text REtrieval Conference (TREC) and the roughly 1.1 million distinct
words they contain. Each edge represents one document-word inclusion.
Statistics
Size  n =  1,729,302

Left size  n_{1} =  556,077

Right size  n_{2} =  1,173,225

Volume  m =  151,632,178

Unique edge count  m̿ =  83,629,405

Wedge count  s =  1,604,790,310,718

Claw count  z =  58,090,382,597,882,208

Cross count  x =  3.122 × 10^{21}

Maximum degree  d_{max} =  457,437

Maximum left degree  d_{1max} =  30,701

Maximum right degree  d_{2max} =  457,437

Average degree  d =  175.368

Average left degree  d_{1} =  272.682

Average right degree  d_{2} =  129.244

Fill  p =  0.000 128 187

Average edge multiplicity  m̃ =  1.813 14

Size of LCC  N =  1,725,011

Diameter  δ =  7

50-percentile effective diameter  δ_{0.5} =  2.953 78

90-percentile effective diameter  δ_{0.9} =  3.808 95

Median distance  δ_{M} =  3

Mean distance  δ_{m} =  3.395 75

Gini coefficient  G =  0.854 045

Balanced inequality ratio  P =  0.168 945

Left balanced inequality ratio  P_{1} =  0.312 897

Right balanced inequality ratio  P_{2} =  0.033 487 1

Relative edge distribution entropy  H_{er} =  0.809 386

Power law exponent  γ =  1.503 70

Tail power law exponent  γ_{t} =  1.401 00

Degree assortativity  ρ =  −0.068 551 7

Degree assortativity p-value  p_{ρ} =  0.000 00

Spectral norm  α =  44,117.0
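
Several of the statistics above are not independent: the size, the average degrees, the fill, and the average edge multiplicity can be recomputed from the four basic counts (left size, right size, volume, unique edge count). The following is a minimal sketch of that consistency check. It assumes the usual definitions for bipartite networks (d = 2m/n, d_{1} = m/n_{1}, d_{2} = m/n_{2}, p = m̿/(n_{1} n_{2}), m̃ = m/m̿), which this page does not state explicitly. The function name `derived_statistics` and the dictionary keys are hypothetical and chosen only for illustration.

```python
# Basic counts copied from the statistics table above.
n1 = 556_077            # left size (documents)
n2 = 1_173_225          # right size (words)
m = 151_632_178         # volume (edges counted with multiplicity)
m_unique = 83_629_405   # unique edge count


def derived_statistics(n1, n2, m, m_unique):
    """Recompute derived statistics from the basic counts.

    The formulas are assumed standard definitions for bipartite
    networks; they are not quoted from this page.
    """
    n = n1 + n2
    return {
        "size": n,
        "average_degree": 2 * m / n,
        "average_left_degree": m / n1,
        "average_right_degree": m / n2,
        "fill": m_unique / (n1 * n2),
        "average_edge_multiplicity": m / m_unique,
    }


stats = derived_statistics(n1, n2, m, m_unique)
for name, value in stats.items():
    print(f"{name}: {value:.6g}")
```

Under these assumed definitions, the recomputed values agree with the listed size, average degrees, fill, and average edge multiplicity to the precision shown in the table.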

References
[1] Jérôme Kunegis. KONECT – The Koblenz Network Collection. In Proc. Int. Conf. on World Wide Web Companion, pages 1343–1350, 2013.

[2] National Institute of Standards and Technology. Text REtrieval Conference (TREC) English documents. http://trec.nist.gov/data/docs_eng.html, August 2010. Volume 4 & 5.
