Zhishi

This is the aggregated network of articles from three Chinese-language online encyclopedias: Baidu, Hudong, and the Chinese Wikipedia, as released by the Zhishi project. The nodes are articles from the three encyclopedias. The edges are hyperlinks, both internal links within each encyclopedia and external links from one encyclopedia to another.

Metadata

Code: ZS
Internal name: zhishi-all
Name: Zhishi
Data source: http://zhishi.me/
Availability: Dataset is available for download
Consistency check: Dataset passed all tests
Category: Hyperlink network
Node meaning: Article
Edge meaning: Link
Network format: Unipartite, directed
Edge type: Unweighted, multiple edges
Reciprocal: Contains reciprocal edges
Directed cycles: Contains directed cycles
Loops: Contains loops
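
The edge properties listed in the metadata (multiple edges, reciprocal edges, loops) can be checked directly on an edge list. The following is a minimal sketch, not the project's own tooling. It assumes a whitespace-separated edge list with one "source target" pair per line and comment lines starting with `%`; this layout is an assumption, as are the function names `parse_edges` and `describe`. The sample data is a hypothetical toy graph, not real Zhishi data.

```python
from collections import Counter

# Hypothetical toy edge list (NOT real Zhishi data); the "%" comment
# convention and the two-column layout are assumptions.
SAMPLE = """\
% toy directed multigraph
1 2
2 1
1 2
3 3
2 3
"""

def parse_edges(text):
    """Return a list of (source, target) pairs, skipping comments and blanks."""
    edges = []
    for line in text.splitlines():
        line = line.strip()
        if not line or line.startswith("%"):
            continue
        src, dst = line.split()[:2]
        edges.append((int(src), int(dst)))
    return edges

def describe(edges):
    """Report which structural properties the directed edge list exhibits."""
    counts = Counter(edges)  # maps each distinct edge to its multiplicity
    return {
        "has_multiple_edges": any(c > 1 for c in counts.values()),
        "has_loops": any(u == v for u, v in counts),
        "has_reciprocal_edges": any(
            u != v and (v, u) in counts for u, v in counts
        ),
    }

edges = parse_edges(SAMPLE)
props = describe(edges)
print(len(edges), props)
```

For the full network, with about 66 million edges, holding every edge in a Python `Counter` may need a large amount of memory, so a streaming or array-based approach would likely be preferable.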

Statistics

Size: n = 7,827,192
Volume: m = 65,905,159
Unique edge count: m̿ = 64,841,973
Loop count: l = 204,128
Wedge count: s = 157,977,425,565
Claw count: z = 4,934,112,356,635,505
Cross count: x = 1.73227 × 10^20
Triangle count: t = 109,502,442
Maximum degree: d_max = 204,292
Maximum outdegree: d+_max = 5,393
Maximum indegree: d−_max = 204,282
Average degree: d = 16.8401
Fill: p = 1.05839 × 10^−6
Average edge multiplicity: m̃ = 1.01640
Size of LCC: N = 5,499,664
Size of LSCC: N_s = 727,424
Relative size of LSCC: N_rs = 0.0929355
Diameter: δ = 31
50-percentile effective diameter: δ_0.5 = 4.02108
90-percentile effective diameter: δ_0.9 = 5.18631
Median distance: δ_M = 5
Mean distance: δ_m = 4.54439
Gini coefficient: G = 0.761952
Balanced inequality ratio: P = 0.207882
Outdegree balanced inequality ratio: P+ = 0.294511
Indegree balanced inequality ratio: P− = 0.136988
Relative edge distribution entropy: H_er = 0.890250
Power law exponent: γ = 1.66625
Degree assortativity: ρ = −0.0269204
Degree assortativity p-value: p_ρ = 0.00000
Clustering coefficient: c = 0.00207946
Directed clustering coefficient: c± = 0.0445350
Spectral norm: α = 1,010.82
Operator 2-norm: ν = 1,009.34
Cyclic eigenvalue: π = 265.575
Reciprocity: y = 0.0769238
Non-bipartivity: b_A = 0.00258892
Normalized non-bipartivity: b_N = 0.00153453
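
Several of the statistics above appear to be simple functions of the basic counts. The sketch below recomputes four of them from the reported size, volume, unique edge count, and LSCC size. The formulas (d = 2m/n, p = m̿/n², m̃ = m/m̿, N_rs = N_s/n) are assumptions inferred from the fact that they reproduce the reported values; the page itself does not state them.

```python
# Counts copied from the statistics listed for this network.
n = 7_827_192          # size (number of nodes)
m = 65_905_159         # volume (edges, counting multiplicity)
m_unique = 64_841_973  # unique edge count
n_lscc = 727_424       # size of the largest strongly connected component

# Assumed definitions; each one reproduces the reported value to the
# precision given on the page.
avg_degree = 2 * m / n            # each edge contributes to two degrees
fill = m_unique / n**2            # fraction of possible directed edges present
avg_multiplicity = m / m_unique   # average copies per distinct edge
rel_lscc = n_lscc / n             # share of nodes inside the LSCC

print(f"average degree       {avg_degree:.4f}")
print(f"fill                 {fill:.5e}")
print(f"average multiplicity {avg_multiplicity:.5f}")
print(f"relative LSCC size   {rel_lscc:.7f}")
```

The fill is on the order of 10^−6, so a dense n × n adjacency matrix would be impractical; a sparse representation is the natural choice for a network of this size.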

Plots

Plots available for this network:

Degree distribution
Cumulative degree distribution
Lorenz curve
Spectral distribution of the adjacency matrix
Spectral distribution of the normalized adjacency matrix
Spectral distribution of the Laplacian
Spectral graph drawing based on the adjacency matrix
Spectral graph drawing based on the normalized adjacency matrix
Degree assortativity
Hop distribution
In/outdegree scatter plot
Edge weight/multiplicity distribution
Clustering coefficient distribution

References

[1] Jérôme Kunegis. KONECT – The Koblenz Network Collection. In Proc. Int. Conf. on World Wide Web Companion, pages 1343–1350, 2013.
[2] Xing Niu, Xinruo Sun, Haofen Wang, Shu Rong, Guilin Qi, and Yong Yu. Zhishi.me – weaving Chinese linking open data. In Proc. Int. Semant. Web Conf., pages 205–220, 2011.