8-1汪小帆QD_Conf
2008.7.30 青岛大学复杂网络论坛
Structure and Evolution ofSJTU Large Online Social Networks Xiaofan Wang
上海交通大学电子信息与电气工程学院
@ 2004 SJTU
Complex Networks & Control Lab
2002 IBM Corporation
SJTU
Outline
Social Network Services (SNS) Static Structure of Wealink(若邻) Network Evolution of Wealink Conclusion 海内存知己,天涯若比邻
@ 2004 SJTU
SJTU
Business 2.0 Magazine 2005
@ 2004 SJTU
SJTU
Social Networking Services
@ 2004 SJTU
SJTU
Social Networking Services
@ 2004 SJTU
SJTU
A twenty-first century science
If handled appropriately, data about Internet-based communication and interactivity could revolutionize our understanding of collective human behavior. D J Watts, NATURE, 1 February 2007
@ 2004 SJTU
SJTU
人肉搜索:让人欢喜让人忧
如果你爱他,把他放到人肉引擎上去,你很快就会知道他的一 切;如果你恨他,把他放到人肉引擎上去,因为那里是地狱
变传统的网络信息搜索为人找人、人问人、人碰人、人挤人、人 挨人的关系型网络社区活动
@ 2004 SJTU
SJTU
13亿人一起动手把她找出来
从声讨到寻找到被扣押
上网的IP地址 上网的具体地点 QQ号和QQ空间 里面存储的相关资料,包括年 龄、血型、居住地被公开 QQ密码被网友攻破,其同事 的QQ号也被查出,信息被公 布到“主战场”——天涯 更大的搜索网铺开,半小时不 到,有匿名网友发帖称得知该 女子的详细信息:1987年出 生,沈阳市苏家屯人,她的身 份证号码、家庭成员、具体地 址、工作地点,甚至父母亲和 哥哥的电话全被“挖”了出来
@ 2004 SJTU
SJTU
Wealink(若邻网) www.wealink.com
@ 2004 SJTU
SJTU
Data Collection 2005. 5.11-2007.8.22
Who? User ID With Who? Friend list When? Time
Nodes: 223, 624 Edges: 273, 395 Mean degree: 2.45 Max. degree: 1657
@ 2004 SJTU
SJTU
Connectivity of Wealink
Connected components: 1844 Mean size: 121.27 Nodes in LCC: 194304 (86.89%) Nodes: 223, 624 Edges: 273, 395
@ 2004 SJTU
SJTU
Size Distribution for Connected Components
10
3
S=31 10 N(S)
2
S=61
10
1
LCC
10 0 10
0
10
1
10
2
10 S
3
10
4
10
5
10
6
There are many groups that comprise about 30 persons and are almost fully connected
@ 2004 SJTU
SJTU
Topological Properties of the LCC
Node number: 194, 304 Edge number: 245, 669 Network density: 1.3014e-5 Average degree: 2.53 Average shortest path length: 7.45 Network diameter: 24 Clustering coefficient: 0.0257 Degree assortativity: -0.0822 Modularity Q: 0.8957
@ 2004 SJTU
SJTU
Degree Distribution of LCC
10
6 5
10
10 N(k)
4
Slope=-1.74 10
3
k=31 k=62 k=91
10
2
10
1
10 0 10
0
10
1
10 k
2
10
3
10
4
Groups comprising 30 persons lead to the peaks for degrees of about integer times of 30.
@ 2004 SJTU
SJTU
A Model Reproducing the Degree Distribution
Step 1: Generating a scale-free network with N no
des by BA model; Step 2: In the network randomly selecting M (M
10 10 10 N be um r 10 10 10
6
10000
5
Wealink
1000 N(k) 100
k=32
Model
k=61
4
3
2
10
1
10 0 10
0
10
1
10 Degree
2
10
3
10
4
1 0 10
10
1
10 k
2
10
3
N=10000, M=30, G=50
@ 2004 SJTU
SJTU
Clustering Coefficient of LCC
10
0
10
-1
10
-2
0.0257
10
-3
10
-4
~ k −1
0
10
-5
10
-6
10
10
1
10 k
2
10
3
The hierarchical structure of Wealink results from the group organizations.
@ 2004 SJTU
SJTU
Shortest Path Length Distribution of LCC
0.25
0.2
0.15 P(l) 0.1 0.05 0 0
5
10 l
15
20
25
It can be well fitted by the trial function
a ≈ 4.76 × 10-5 ,
@ 2004 SJTU
P(l ) = ale
c ≈ 1.92
− bl 2 + cl
b ≈ 0.14,
Xu X P, Hu J H, Liu F. CHAOS, 2007, 17, 023111.
SJTU
Degree Disassortativity of LCC
r=-0.0822
140 120 100 80 60 40 20 0 0 200 400 600 800 k 1000 1200 1400 1600 1800
is mean degree of neighbor nodes of nodes with degree k
Newman M E J, Park J. Phys. Rev. E, 2003, 68, 036122.
@ 2004 SJTU
SJTU
Degree Assortativity of Real Social Networks
Most real social networks exhibit an assortative mixing pattern.
@ 2004 SJTU
SJTU
Degree Disassortativity of Online Social Networks
Many online social networks show a disassortative mixing pattern.
@ 2004 SJTU
SJTU
Real life
Elites
Social boundary
Layfolks
Online
Elites Layfolks
@ 2004 SJTU
SJTU
Betweenness Centrality of LCC
Correlation between node BC and degree in LCC. Some nodes with small degree have large BC because the nodes connect several communities in the network and act as bridges among the communities.
@ 2004 SJTU
SJTU
Community Structure Dis. of LCC
10
2
S=31
S=62 N(S) 10
1
S=92
10 0 10
0
10
1
10
2
10 S
3
10
4
10
5
The groups comprising 30 persons result in the peaks for community sizes of about integer times of 30.
Clauset A, Newman M E J, Moore C. Phys. Rev. E, 2004, 70, 066111.
@ 2004 SJTU
SJTU
Outline
Wealink Static Structure Network Evolution Conclusion
@ 2004 SJTU
SJTU
Evolution of Network Size
300000 0.0040 250000 0.0035 0.0030 200000
250000
0.012 0.010 0.008
200000
N(T), E(T)
0.0025
N(T), E(T)
d(T)
150000 0.006 100000 0.004
d(T)
150000
0.0020 0.0015
100000
50000
N(T) E(T) d(T)
0.0010
50000
0.0005 0.0000 -0.0005 30
N(T) E(T) d(T)
0.002 0.000
0 0 5 10 15 20 25
0 0 5 10 15 20 25
30
T(Month)
T(Month)
Evolution of the numbers of nodes N(T) and edges E(T), and network density d(T) for the whole network (left) and the LCC (right) from June 11th 2005 to August 11th 2007.
@ 2004 SJTU
SJTU
S Shape and Bass Diffusion Model
300000 0.0040 250000 0.0035 0.0030 200000
N(T), E(T)
0.0025
d(T)
150000
0.0020 0.0015
100000
50000
N(T) E(T) d(T)
0.0010 0.0005 0.0000 -0.0005 30
Innovators
0 0 5 10 15 20 25
T(Month)
Mass media and advertisements
A Generic Characteristic for Online Social Networks?
Bass,
F.M., 1969. Management Science 15(5), 215-227. Dodds, P.S., Watts, D.J., 2004. Physical Review Letters 92 (Art. No. 218701). Watts, D.J., Dodds, P.S., 2007. Journal of Consumer Research 34, 441-458.
@ 2004 SJTU
Imitators
SJTU
Bass Diffusion Model
m : total number of users during the entire diffusion period S(T) : new nodes (users) or edges (friend relations) at T Y(T) : total number of nodes or edges in the network at T
S(T)
S (T ) =
T*
Y(T)
Time
me − (T − a )/ s s ⎡1 + e ⎣
− (T − a )/ s
⎤ ⎦
2
m
Point of inflection
m Y (T ) = , − (T − a ) s 1+ e
T*
@ 2004 SJTU
Time
SJTU
2.5 x 10
5
Scale growth of Wealink
Whole network: • node number • edge number(b)
25 30
2
Nnode(T)
1.5
1
(a)
0.5
0
5
10
15 T(Month)
20
3
x 10
5
2.5
2 Nedge(T)
(c)
1.5
1
0.5
Circles: total number of nodes/edges at T Crosses: number of new nodes/edges at T (d) Solid lines: fitted curves by Bass model
0
5
10
15 T(Month)
20
25
30
@ 2004 SJTU
SJTU
3 x 10
5
2.5
2 Nedge(T)
1.5
1
(c)
0.5
LCC: • node number • edge number
25 30
0
5
10
15 T(Month)
20
2.5
x 10
5
2
Nedge(T)
1.5
(d)
1
0.5
Circles: total number of nodes/edges at T Crosses: number of new nodes/edges at T Solid lines: fitted curves by Bass model
0 0
5
10
15 T(Month)
20
25
30
@ 2004 SJTU
SJTU
Evolution of Average Path Length and Diameter
26 24 22 0.012 0.010
l(T), D(T)
20 18 16 14 12 10 8 6 0 5 10 15 20 25 0.000 30 0.004 0.002
l(T) D(T) d(T)
0.008 0.006
d(T)
T(Month)
Time evolution of l(T) and D(T) for the LCC in comparison with network density d(T).
@ 2004 SJTU
SJTU
Evolution of Clustering Coefficient
0.012 0.010 0.008
d(T) C(T)
0.30 0.25 0.20 0.15 0.10
C(T)
d(T)
0.006 0.004 0.002
0.05 0.000 0.00
0 5 10 15 20 25 30
T(Month)
C(T) for the LCC over time in comparison with network density d(T). They show strong positive correlation.
@ 2004 SJTU
SJTU
Evolution of Degree Assortativity
0.4
Assortativity
r(T)
0.3 0.2 0.1 0.0 -0.1 -0.2 0 5 10 15 20 25 30
Disassortativity
T(Month)
A generic characteristic of online social networks?
@ 2004 SJTU
SJTU
Mechanism for the Assortativity-Disassortativity Transition
Elites
Social boundary
Layfolks
Elites
Layfolks
@ 2004 SJTU
SJTU
Conclusions
Static Structural Properties of Wealink
Small world Hierarchical structure
N(k) 10
6
10
5
10
4
Scale-free Large clustering coefficient Degree disassortativity Strong modularity
Slope=-1.74 10
3
k=31 k=62 k=91
10
2
10
1
10 0 10
0
10
1
10 k
2
10
3
10
4
Peaks in the distributions of connected component size, degree and community size
@ 2004 SJTU
SJTU
Structural Evolution of Wealink
The S-shaped growth may be a generic feature for online social networks. The network underwent a transition from degree assortativity characteristic of realistic social networks to degree disassortativity characteristic of many virtual online social networks.
0.4 0.3 0.2 0.1
r(T)
0.0 -0.1 -0.2 0 5 10 15 20 25 30
T(Month)
@ 2004 SJTU
SJTU
Multidimensional Networks in Web 2.0 Multiple Types of Nodes and Multiple Types of Relationships
@ 2004 SJTU
SJTU
Thank You!
上海交通大学电子信息与电气工程学院
@ 2004 SJTU