PubMed
This is the bipartite document–word dataset of PubMed. Left nodes are
documents and right nodes are words. Edge weights are multiplicities: the
weight of an edge is the number of times the word occurs in the document.
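As a minimal sketch of this structure, the snippet below builds a toy bipartite document–word graph from (document, word, count) triples and derives the volume, the unique edge count, and the degrees. The triples are hypothetical illustration data, not taken from PubMed. The snippet assumes that each (document, word) pair appears at most once in the list and that degrees count edges with their multiplicity.

```python
from collections import Counter

# Hypothetical toy data: (document, word, count) triples.
# Each (document, word) pair is assumed to appear only once.
triples = [
    ("doc1", "cell", 3),
    ("doc1", "protein", 1),
    ("doc2", "cell", 2),
    ("doc2", "gene", 1),
    ("doc3", "gene", 4),
]

left_degree = Counter()   # degree of each document (left node)
right_degree = Counter()  # degree of each word (right node)
for doc, word, count in triples:
    # Assumption: a degree counts edges with their multiplicity.
    left_degree[doc] += count
    right_degree[word] += count

volume = sum(count for _, _, count in triples)  # edges counted with multiplicity
unique_edges = len(triples)                      # distinct (document, word) pairs
avg_multiplicity = volume / unique_edges

print("volume:", volume)
print("unique edges:", unique_edges)
print("average edge multiplicity:", avg_multiplicity)
print("maximum left degree:", max(left_degree.values()))
print("maximum right degree:", max(right_degree.values()))
```

Storing one triple per distinct pair keeps the unique edge count and the volume separable: the first is the number of triples, the second is the sum of their counts.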
Statistics
| Statistic | Symbol | Value |
|---|---|---|
| Size | n | 8,341,043 |
| Left size | n1 | 8,200,000 |
| Right size | n2 | 141,043 |
| Volume | m | 737,869,083 |
| Unique edge count | m̿ | 483,450,157 |
| Wedge count | s | 42,676,090,519,343 |
| Maximum degree | dmax | 2,323,263 |
| Maximum left degree | d1max | 436 |
| Maximum right degree | d2max | 2,323,263 |
| Average degree | d | 176.925 |
| Average left degree | d1 | 89.9840 |
| Average right degree | d2 | 5,231.52 |
| Average edge multiplicity | m̃ | 1.52626 |
| Size of LCC | N | 8,341,043 |
| 50-Percentile effective diameter | δ0.5 | 1.82250 |
| 90-Percentile effective diameter | δ0.9 | 3.73123 |
| Mean distance | δm | 2.76415 |
| Balanced inequality ratio | P | 0.283092 |
| Left balanced inequality ratio | P1 | 0.413586 |
| Right balanced inequality ratio | P2 | 0.102089 |
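Several of the statistics above can be cross-checked against one another. The sketch below recomputes the size and the averages from the reported left size, right size, volume, and unique edge count. The formulas (d = 2m/n, d1 = m/n1, d2 = m/n2, m̃ = m/m̿) are assumptions inferred from the reported numbers rather than definitions stated on this page, and the variable names are illustrative.

```python
# Values copied from the statistics table.
n1 = 8_200_000            # left size (documents)
n2 = 141_043              # right size (words)
m = 737_869_083           # volume (edges counted with multiplicity)
m_unique = 483_450_157    # unique edge count

# Assumed relationships between the statistics.
n = n1 + n2                            # size
avg_degree = 2 * m / n                 # average degree
avg_left_degree = m / n1               # average left degree
avg_right_degree = m / n2              # average right degree
avg_multiplicity = m / m_unique        # average edge multiplicity

print(f"n  = {n:,}")
print(f"d  = {avg_degree:.3f}")
print(f"d1 = {avg_left_degree:.4f}")
print(f"d2 = {avg_right_degree:,.2f}")
print(f"average multiplicity = {avg_multiplicity:.5f}")
```

Under these assumed formulas, the recomputed values agree with the table to the precision reported there.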
References
[1] Jérôme Kunegis. KONECT – The Koblenz Network Collection. In Proc. Int. Conf. on World Wide Web Companion, pages 1343–1350, 2013.
[2] M. Lichman. UCI Machine Learning Repository, 2013.