Enron (clean)

The Enron email network consists of 1,148,072 emails exchanged among Enron employees from 1999 to 2003. Nodes in the network are individual employees and edges are individual emails. An employee can send an email to themselves, so the network contains loops. In this version of the network, emails (edges) with erroneous timestamps were removed, which is why the volume reported under Statistics (1,147,126) is lower than the full email count.
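As an illustration of the representation described above, the following minimal sketch builds a directed multigraph with loops from a handful of email records and counts its nodes, edges, distinct sender/recipient pairs, and loops. The records, the names, and the `summarize` helper are hypothetical and are not taken from the dataset; counting one loop per self-addressed email is also an assumption.

```python
from collections import Counter

# Hypothetical toy records: (sender, recipient, unix_timestamp).
emails = [
    ("alice", "bob", 946684800),
    ("alice", "bob", 946688400),    # same pair again -> multiple edge
    ("bob", "alice", 946692000),    # opposite direction -> reciprocal edge
    ("carol", "carol", 946695600),  # self-addressed email -> loop
]


def summarize(records):
    """Count nodes, edges, distinct directed pairs, and loops."""
    nodes = {
        person
        for sender, recipient, _ in records
        for person in (sender, recipient)
    }
    pair_counts = Counter((sender, recipient) for sender, recipient, _ in records)
    return {
        "size": len(nodes),                # number of distinct employees
        "volume": len(records),            # every email is one edge
        "unique_edges": len(pair_counts),  # distinct (sender, recipient) pairs
        "loops": sum(1 for s, r, _ in records if s == r),
    }


summary = summarize(emails)
print(summary)
```

Keeping every record as its own edge, rather than collapsing repeated pairs, is what makes the network a multigraph: the volume counts emails, while the unique edge count counts who has written to whom.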

Metadata

Code: EN2
Internal name: enron-rm
Name: Enron (clean)
Data source: http://www.cs.cmu.edu/~enron/
Availability: Dataset is not available for download
Consistency check: Dataset passed all tests
Category: Communication network
Node meaning: User
Edge meaning: Email
Network format: Unipartite, directed
Edge type: Unweighted, multiple edges
Temporal data: Edges are annotated with timestamps
Reciprocal: Contains reciprocal edges
Directed cycles: Contains directed cycles
Loops: Contains loops

Statistics

Size n = 87,273
Volume m = 1,147,126
Unique edge count m̿ = 321,288
Loop count l = 13,080
Wedge count s = 49,130,407
Claw count z = 13,799,850,890
Cross count x = 3,784,732,758,298
Triangle count t = 1,178,289
Square count q = 92,655,137
4-Tour count T4 = 938,356,410
Maximum degree dmax = 38,778
Maximum outdegree d+max = 32,613
Maximum indegree d−max = 6,165
Average degree d = 26.2882
Fill p = 4.23495 × 10^−5
Average edge multiplicity m̃ = 3.57040
Size of LCC N = 84,220
Size of LSCC Ns = 9,160
Relative size of LSCC Nrs = 0.104958
Diameter δ = 14
50-Percentile effective diameter δ0.5 = 4.38396
90-Percentile effective diameter δ0.9 = 5.79471
Median distance δM = 5
Mean distance δm = 4.89080
Gini coefficient G = 0.907479
Balanced inequality ratio P = 0.100665
Outdegree balanced inequality ratio P+ = 0.106781
Indegree balanced inequality ratio P− = 0.134847
Relative edge distribution entropy Her = 0.827659
Power law exponent γ = 2.65047
Tail power law exponent γt = 1.76100
Degree assortativity ρ = −0.166816
Degree assortativity p-value pρ = 0.00000
In/outdegree correlation ρ± = +0.403460
Clustering coefficient c = 0.0719487
Directed clustering coefficient c± = 0.111155
Spectral norm α = 7,808.15
Operator 2-norm ν = 3,904.14
Cyclic eigenvalue π = 3,904.00
Algebraic connectivity a = 0.00422927
Reciprocity y = 0.146678
Non-bipartivity bA = 0.622351
Normalized non-bipartivity bN = 0.00126222
Spectral bipartite frustration bK = 8.92914 × 10^−5
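
Several of the derived statistics above can be cross-checked from the basic counts. The sketch below recomputes three of them; the formulas (average degree as 2m/n, average edge multiplicity as m/m̿, relative LSCC size as Ns/n) are assumed definitions that this page does not state, although they do reproduce the listed values. The variable names are illustrative.

```python
# Basic counts copied from the statistics listed above.
n = 87_273          # size (number of nodes)
m = 1_147_126       # volume (number of edges, with multiplicity)
m_unique = 321_288  # unique edge count
n_lscc = 9_160      # size of the largest strongly connected component

# Assumed definitions (not stated on this page):
avg_degree = 2 * m / n           # each edge contributes to two degrees
avg_multiplicity = m / m_unique  # emails per distinct sender/recipient pair
rel_lscc = n_lscc / n            # share of nodes inside the LSCC

print(f"{avg_degree:.4f}")        # 26.2882
print(f"{avg_multiplicity:.5f}")  # 3.57040
print(f"{rel_lscc:.6f}")          # 0.104958
```

Read together, these figures say that a typical pair of correspondents exchanged between three and four emails, and that only about a tenth of the nodes belong to the largest strongly connected component.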

Plots

Degree distribution
Cumulative degree distribution
Lorenz curve
Spectral distribution of the adjacency matrix
Spectral distribution of the normalized adjacency matrix
Spectral distribution of the Laplacian
Spectral graph drawing based on the adjacency matrix
Spectral graph drawing based on the Laplacian
Spectral graph drawing based on the normalized adjacency matrix
Degree assortativity
Zipf plot
Hop distribution
In/outdegree scatter plot
Edge weight/multiplicity distribution
Clustering coefficient distribution
Average neighbor degree distribution
Temporal distribution
Temporal hop distribution
Diameter/density evolution
SynGraphy
Inter-event distribution
Node-level inter-event distribution

Matrix decompositions plots

References

[1] Jérôme Kunegis. KONECT – The Koblenz Network Collection. In Proc. Int. Conf. on World Wide Web Companion, pages 1343–1350, 2013.
[2] Bryan Klimt and Yiming Yang. The Enron corpus: A new dataset for email classification research. In Proc. Eur. Conf. on Mach. Learn., pages 217–226, 2004.