PubMed
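This is the bipartite document–word network of PubMed. Left nodes are
documents and right nodes are words. An edge connects a document to a word
it contains, and the edge weight is its multiplicity, i.e. the number of
times the word occurs in that document.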
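To make the structure concrete, here is a minimal sketch of such a bipartite multigraph on a toy example. The document and word names and the helper `bipartite_stats` are hypothetical and not part of the dataset; the sketch only illustrates how volume (edges counted with multiplicity) differs from the unique edge count.

```python
from collections import Counter

# Hypothetical toy data: (document, word) -> number of occurrences.
# These names are illustrative only and do not come from the dataset.
edges = Counter({
    ("doc1", "gene"): 2,
    ("doc1", "protein"): 1,
    ("doc2", "gene"): 1,
    ("doc2", "cell"): 3,
})


def bipartite_stats(edges):
    """Compute basic size statistics of a weighted bipartite edge map."""
    left = {doc for doc, _ in edges}     # left nodes: documents
    right = {word for _, word in edges}  # right nodes: words
    volume = sum(edges.values())         # edges counted with multiplicity
    unique = len(edges)                  # distinct (document, word) pairs
    return {
        "n1": len(left),
        "n2": len(right),
        "n": len(left) + len(right),
        "m": volume,
        "unique_edges": unique,
        "avg_multiplicity": volume / unique,
    }


stats = bipartite_stats(edges)
print(stats)
```

On the real network the same quantities correspond to the left size, right size, volume, unique edge count and average edge multiplicity.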
Statistics
Size | n = | 8,341,043
Left size | n1 = | 8,200,000
Right size | n2 = | 141,043
Volume | m = | 737,869,083
Unique edge count | m̿ = | 483,450,157
Wedge count | s = | 42,676,090,519,343
Maximum degree | dmax = | 2,323,263
Maximum left degree | d1max = | 436
Maximum right degree | d2max = | 2,323,263
Average degree | d = | 176.925
Average left degree | d1 = | 89.9840
Average right degree | d2 = | 5,231.52
Average edge multiplicity | m̃ = | 1.52626
Size of LCC | N = | 8,341,043
50-Percentile effective diameter | δ0.5 = | 1.82250
90-Percentile effective diameter | δ0.9 = | 3.73123
Mean distance | δm = | 2.76415
Balanced inequality ratio | P = | 0.283092
Left balanced inequality ratio | P1 = | 0.413586
Right balanced inequality ratio | P2 = | 0.102089
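Several of the statistics above can be cross-checked from the four basic counts. The sketch below assumes the usual definitions (n = n1 + n2, d = 2m/n, d1 = m/n1, d2 = m/n2, and average edge multiplicity as volume divided by unique edge count). These formulas are an assumption, not stated on this page, but they reproduce the listed values to the precision shown. The variable names are illustrative.

```python
# Basic counts copied from the statistics listed above.
n1 = 8_200_000          # left size (documents)
n2 = 141_043            # right size (words)
m = 737_869_083         # volume (edges counted with multiplicity)
m_unique = 483_450_157  # unique edge count

# Derived quantities under the assumed definitions.
n = n1 + n2                        # size
avg_degree = 2 * m / n             # each edge contributes to two degrees
avg_left_degree = m / n1           # average document degree
avg_right_degree = m / n2          # average word degree
avg_multiplicity = m / m_unique    # average edge multiplicity

print(f"n  = {n:,}")
print(f"d  = {avg_degree:.3f}")
print(f"d1 = {avg_left_degree:.4f}")
print(f"d2 = {avg_right_degree:.2f}")
print(f"avg multiplicity = {avg_multiplicity:.5f}")
```

The size of the largest connected component equals n, so under these numbers the network is connected.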
References
[1] Jérôme Kunegis. KONECT – The Koblenz Network Collection. In Proc. Int. Conf. on World Wide Web Companion, pages 1343–1350, 2013.
[2] M. Lichman. UCI Machine Learning Repository, 2013.