This is the bipartite document–word dataset of PubMed. Left nodes are
documents and right nodes are words. Edge weights are multiplicities.
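
As a minimal sketch of what "edge weights are multiplicities" means for a bipartite document–word graph, the snippet below builds a toy graph from word occurrences. The document and word names are hypothetical illustration data, not taken from the PubMed dataset, and the variable names are assumptions chosen for readability.

```python
from collections import Counter

# Hypothetical toy data: one (document, word) pair per word occurrence.
occurrences = [
    ("doc1", "cell"), ("doc1", "cell"), ("doc1", "protein"),
    ("doc2", "cell"), ("doc2", "gene"), ("doc2", "gene"), ("doc2", "gene"),
]

# Each distinct (document, word) pair is one edge; its weight is the
# number of times the word occurs in the document (the multiplicity).
edge_weights = Counter(occurrences)

left_nodes = {doc for doc, _ in edge_weights}     # documents
right_nodes = {word for _, word in edge_weights}  # words

volume = sum(edge_weights.values())       # counts every occurrence
unique_edges = len(edge_weights)          # counts each pair once
avg_multiplicity = volume / unique_edges  # mean weight per unique edge
```

Under this reading, the volume counts every word occurrence, while the unique edge count counts each document–word pair once.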
Size | n = | 8,341,043
Left size | n1 = | 8,200,000
Right size | n2 = | 141,043
Volume | m = | 737,869,083
Unique edge count | m̿ = | 483,450,157
Wedge count | s = | 42,676,090,519,343
Maximum degree | dmax = | 2,323,263
Maximum left degree | d1max = | 436
Maximum right degree | d2max = | 2,323,263
Average degree | d = | 176.925
Average left degree | d1 = | 89.9840
Average right degree | d2 = | 5,231.52
Average edge multiplicity | m̃ = | 1.52626
Size of LCC | N = | 8,341,043
50-Percentile effective diameter | δ0.5 = | 1.82250
90-Percentile effective diameter | δ0.9 = | 3.73123
Mean distance | δm = | 2.76415
Balanced inequality ratio | P = | 0.283092
Left balanced inequality ratio | P1 = | 0.413586
Right balanced inequality ratio | P2 = | 0.102089
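
The size and average statistics above are related by simple arithmetic. The following sketch recomputes them from the listed node counts, volume, and unique edge count. It assumes the usual definitions (average degree as 2m/n, average left and right degree as m/n1 and m/n2, average multiplicity as volume divided by unique edge count); the variable names are illustrative only.

```python
# Values copied from the statistics listed above.
n1 = 8_200_000          # left size (documents)
n2 = 141_043            # right size (words)
m = 737_869_083         # volume (edges counted with multiplicity)
m_unique = 483_450_157  # unique edge count

# Assumed definitions; compare the results with the listed values.
n = n1 + n2                # size
avg_degree = 2 * m / n     # each edge contributes to two node degrees
avg_left = m / n1          # average left degree
avg_right = m / n2         # average right degree
avg_mult = m / m_unique    # average edge multiplicity

for name, value in [
    ("n", n),
    ("d", avg_degree),
    ("d1", avg_left),
    ("d2", avg_right),
    ("average multiplicity", avg_mult),
]:
    print(f"{name} = {value:,.6g}")
```

Under these assumed definitions, the recomputed values agree with the listed ones up to rounding.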
Matrix decompositions plots
Jérôme Kunegis.
KONECT – The Koblenz Network Collection.
In Proc. Int. Conf. on World Wide Web Companion, pages
1343–1350, 2013.
M. Lichman.
UCI Machine Learning Repository, 2013.