NY Times
This is the bipartite document–word network of the NY Times dataset. Left
nodes are documents and right nodes are words. Edge weights are multiplicities,
i.e. the number of times a word occurs in a document.
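
As a minimal sketch of how multiplicities relate to the volume and the unique edge count reported for this network, the snippet below collapses repeated (document, word) pairs into weighted edges. The toy occurrence list and the helper name `build_weighted_edges` are hypothetical and only serve as illustration; they are not part of the dataset or of any KONECT tooling.

```python
from collections import Counter

# Hypothetical toy data: one (document_id, word_id) pair per word occurrence.
occurrences = [(1, 10), (1, 10), (1, 11), (2, 10), (2, 12), (2, 12), (2, 12)]


def build_weighted_edges(pairs):
    """Collapse repeated (document, word) pairs into edge multiplicities."""
    return Counter(pairs)


edges = build_weighted_edges(occurrences)

# Volume: total number of edges, counting multiplicities.
volume = sum(edges.values())
# Unique edge count: number of distinct (document, word) pairs.
unique_edges = len(edges)
# Average edge multiplicity: volume divided by unique edge count.
avg_multiplicity = volume / unique_edges

print(volume, unique_edges, avg_multiplicity)
```

In this reading, the volume counts every word occurrence, while the unique edge count counts each document–word pair once.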
Statistics
Size | n = | 401,388
Left size | n1 = | 299,752
Right size | n2 = | 101,636
Volume | m = | 99,542,125
Unique edge count | m̿ = | 69,679,427
Wedge count | s = | 621,479,000,671
Claw count | z = | 8,494,942,350,924,751
Maximum degree | dmax = | 108,622
Maximum left degree | d1max = | 2,017
Maximum right degree | d2max = | 108,622
Average degree | d = | 495.990
Average left degree | d1 = | 332.082
Average right degree | d2 = | 979.398
Average edge multiplicity | m̃ = | 1.42857
Size of LCC | N = | 401,388
Diameter | δ = | 7
50-Percentile effective diameter | δ0.5 = | 1.87339
90-Percentile effective diameter | δ0.9 = | 2.88994
Median distance | δM = | 2
Mean distance | δm = | 2.48643
Balanced inequality ratio | P = | 0.296549
Left balanced inequality ratio | P1 = | 0.403163
Right balanced inequality ratio | P2 = | 0.122051
Power law exponent | γ = | 1.19659
Degree assortativity | ρ = | −0.0530582
Degree assortativity p-value | pρ = | 0.00000
Spectral separation | |λ1[A] / λ2[A]| = | 1.85281
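
Several of the statistics above appear to follow from the basic counts by simple arithmetic. The sketch below recomputes them under the assumption that the average degree is 2m / n, the left and right average degrees are m / n1 and m / n2, and the average edge multiplicity is m / m̿. These formulas are inferred from the listed values, not quoted from the source, and the function name `derived_stats` is hypothetical.

```python
# Basic counts taken from the statistics table.
n1, n2 = 299_752, 101_636      # left size (documents), right size (words)
m = 99_542_125                 # volume (edges counted with multiplicity)
m_unique = 69_679_427          # unique edge count


def derived_stats(n1, n2, m, m_unique):
    """Recompute derived statistics from the basic counts (assumed formulas)."""
    n = n1 + n2
    return {
        "n": n,                     # size
        "d": 2 * m / n,             # average degree
        "d1": m / n1,               # average left degree
        "d2": m / n2,               # average right degree
        "m_tilde": m / m_unique,    # average edge multiplicity
    }


stats = derived_stats(n1, n2, m, m_unique)
for name, value in stats.items():
    print(f"{name} = {value}")
```

Under these assumptions the recomputed values should agree with the table to the listed precision, which makes for a quick consistency check after downloading the data.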
References
[1] Jérôme Kunegis. KONECT – The Koblenz Network Collection. In Proc. Int. Conf. on World Wide Web Companion, pages 1343–1350, 2013.
[2] M. Lichman. UCI Machine Learning Repository, 2013.