# ISN: Communities and cliques

Dyads alone are not yet very interesting for network research. Starting with triads, however, interesting behaviour appears: balance and control become possible. Closed triads (triangles) appear more commonly in social networks than in random graphs.

Clustering coefficient

The clustering coefficient measures the amount of transitivity in a network. A relation is transitive when A being related to B, and B in turn being related to C, implies that A is also related to C. The index ranges from 0 to 1. In social networks it is usually between 0.3 and 0.6.
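As a minimal sketch, the index can be computed as the share of two-paths (A-B-C) that are closed by a third edge (A-C). The function name `transitivity` and the toy network are illustrative assumptions, and the graph is assumed to be undirected and stored as a dictionary of neighbour sets.

```python
from itertools import combinations

def transitivity(adj):
    """Share of two-paths a-centre-b that are closed by an edge a-b."""
    closed = 0
    two_paths = 0
    for centre, neighbours in adj.items():
        # every pair of neighbours of `centre` forms one two-path
        for a, b in combinations(sorted(neighbours), 2):
            two_paths += 1
            if b in adj[a]:
                closed += 1
    return closed / two_paths if two_paths else 0.0

# Hypothetical toy network: triangle A-B-C plus a pendant node D attached to C.
toy = {
    "A": {"B", "C"},
    "B": {"A", "C"},
    "C": {"A", "B", "D"},
    "D": {"C"},
}
print(transitivity(toy))
```

In the toy network three of the five two-paths are closed, so the index lies within the range typically reported for social networks.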

Four patterns are counted over all triples of vertices: Empty (three vertices with no connections), 1-Edge (two vertices are connected by an edge and the third is connected to neither), 2-Edge (one vertex is connected to both other vertices, which are not connected to each other) and Triangle (all three vertices are connected).
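The counting of the four patterns can be sketched as follows for an undirected graph. The function name `triad_census` and the toy data are assumptions for illustration; the loop visits every triple, so it is only practical for small networks.

```python
from itertools import combinations

def triad_census(nodes, edges):
    """Count triples by number of edges: 0 Empty, 1 1-Edge, 2 2-Edge, 3 Triangle."""
    edge_set = {frozenset(e) for e in edges}
    counts = {0: 0, 1: 0, 2: 0, 3: 0}
    for triple in combinations(nodes, 3):
        n_edges = sum(frozenset(pair) in edge_set
                      for pair in combinations(triple, 2))
        counts[n_edges] += 1
    return counts

# Hypothetical toy network: triangle A-B-C plus a pendant node D attached to C.
nodes = ["A", "B", "C", "D"]
edges = [("A", "B"), ("B", "C"), ("A", "C"), ("C", "D")]
census = triad_census(nodes, edges)
print(census)

# The clustering coefficient follows from the census: each Triangle closes
# three two-paths, each 2-Edge pattern leaves one two-path open.
clustering = 3 * census[3] / (3 * census[3] + census[2])
print(clustering)
```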

Groups of more than 3

There are a few technical descriptions that extend (and modify) the complete triad to more than 3 nodes:

• Clique: Maximal complete subgraph of $n \geq 3$ nodes
• k-clique: Relaxation to $k > 1$, where $k$ is the maximal geodesic distance allowed between any two members
• k-core: Maximal subgraph in which each node is adjacent to at least $k$ other nodes of the subgraph
• k-plex: Maximal subgraph of $g$ nodes in which each node is adjacent to no fewer than $g-k$ nodes of the subgraph
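Of these definitions, the k-core is the simplest to compute: nodes with fewer than $k$ remaining neighbours are removed repeatedly until none is left. The sketch below illustrates this pruning; the function name `k_core` and the toy network are assumptions, and the graph is assumed to be undirected.

```python
def k_core(adj, k):
    """Return the node set of the k-core by repeatedly pruning low-degree nodes."""
    remaining = {v: set(nbrs) for v, nbrs in adj.items()}  # work on a copy
    changed = True
    while changed:
        changed = False
        for v in list(remaining):
            if len(remaining[v]) < k:
                # remove v and detach it from its remaining neighbours
                for u in remaining[v]:
                    remaining[u].discard(v)
                del remaining[v]
                changed = True
    return set(remaining)

# Hypothetical toy network: triangle A-B-C plus a pendant node D attached to C.
toy = {
    "A": {"B", "C"},
    "B": {"A", "C"},
    "C": {"A", "B", "D"},
    "D": {"C"},
}
print(k_core(toy, 2))
```

The pendant node D drops out of the 2-core, and no node has three neighbours inside a common subgraph, so the 3-core is empty.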

Communities

Communities are groups of nodes that are densely connected internally and sparsely connected to the rest of the network. Community structure can affect individuals, groups and whole networks, and gives insight into how social systems work.

Community detection

Community detection is a computationally difficult problem. Finding the optimal solution is not always feasible, so algorithmic approximations are often used to detect communities.

Modularity

Modularity is always smaller than 1, but can also take negative values. Higher values mean more edges within modules than expected.

$Q = \frac{1}{2m}\sum_{ij}\delta(C_i,C_j)(A_{ij}-P_{ij})$

where $A_{ij}$ encodes whether an edge exists between nodes $i$ and $j$, $P_{ij}$ is the probability of that edge existing, $m$ is the number of edges and $\delta(C_i,C_j)$ indicates whether the two nodes are inside the same module.
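The formula can be evaluated directly by summing over all node pairs. The notes do not fix $P_{ij}$, so the sketch below assumes the common choice $P_{ij} = k_i k_j / (2m)$, where $k_i$ is the degree of node $i$. The function name `modularity` and the toy graph are likewise assumptions.

```python
def modularity(adj, community):
    """Q = 1/(2m) * sum_ij delta(C_i, C_j) * (A_ij - P_ij) for an undirected graph.

    Assumption: P_ij = k_i * k_j / (2m), i.e. edges are expected in
    proportion to the degrees of both endpoints.
    """
    m = sum(len(nbrs) for nbrs in adj.values()) / 2  # number of edges
    q = 0.0
    for i in adj:
        for j in adj:
            if community[i] != community[j]:
                continue  # delta(C_i, C_j) = 0
            a_ij = 1.0 if j in adj[i] else 0.0
            p_ij = len(adj[i]) * len(adj[j]) / (2 * m)
            q += a_ij - p_ij
    return q / (2 * m)

# Hypothetical toy graph: two triangles (1-2-3 and 4-5-6) joined by the edge 3-4.
toy = {1: {2, 3}, 2: {1, 3}, 3: {1, 2, 4}, 4: {3, 5, 6}, 5: {4, 6}, 6: {4, 5}}
two_modules = {1: 0, 2: 0, 3: 0, 4: 1, 5: 1, 6: 1}
one_module = {v: 0 for v in toy}
print(modularity(toy, two_modules))
print(modularity(toy, one_module))
```

Splitting the toy graph into its two triangles gives $Q = 5/14$, while putting all nodes into one module gives $Q = 0$ under this choice of $P_{ij}$.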

Kernighan-Lin Algorithm

Nodes are randomly assigned to a pre-determined number of communities, and the modularity score is computed for switching any single node to another community. The switch that achieves the highest modularity is applied. The process is repeated until no switch can improve the score any further. The solution, however, is not necessarily optimal: a local maximum may be reached depending on the random initial assignment. The algorithm should therefore be repeated with different random initial assignments.
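The switching procedure as described in these notes can be sketched as a simple local search with restarts. This is an illustration of the description above rather than a reference implementation; the function names, the seed handling and the choice $P_{ij} = k_i k_j / (2m)$ inside `modularity` are assumptions.

```python
import random

def modularity(adj, community):
    """Modularity with the assumed null model P_ij = k_i * k_j / (2m)."""
    m = sum(len(nbrs) for nbrs in adj.values()) / 2
    q = 0.0
    for i in adj:
        for j in adj:
            if community[i] == community[j]:
                q += (1.0 if j in adj[i] else 0.0) - len(adj[i]) * len(adj[j]) / (2 * m)
    return q / (2 * m)

def switch_search(adj, n_communities, rng):
    """One run: random start, then always apply the best single-node switch."""
    community = {v: rng.randrange(n_communities) for v in adj}
    best_q = modularity(adj, community)
    while True:
        best_move = None
        for v in adj:
            original = community[v]
            for c in range(n_communities):
                if c == original:
                    continue
                community[v] = c  # try the switch
                q = modularity(adj, community)
                if q > best_q + 1e-12:
                    best_q, best_move = q, (v, c)
            community[v] = original  # undo the trial
        if best_move is None:
            return community, best_q  # no switch improves the score
        v, c = best_move
        community[v] = c

def best_of_restarts(adj, n_communities, restarts=20, seed=0):
    """Repeat the search from different random starts and keep the best result."""
    rng = random.Random(seed)
    runs = [switch_search(adj, n_communities, rng) for _ in range(restarts)]
    return max(runs, key=lambda run: run[1])

# Hypothetical toy graph: two triangles joined by the edge 3-4.
toy = {1: {2, 3}, 2: {1, 3}, 3: {1, 2, 4}, 4: {3, 5, 6}, 5: {4, 6}, 6: {4, 5}}
community, score = best_of_restarts(toy, n_communities=2)
print(community, score)
```

Each run stops at an assignment that no single switch can improve, which is exactly why the restarts matter: putting all nodes into one community is already such a local maximum on this toy graph.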

Edge-Betweenness Clustering Algorithm

Evaluate the edge-betweenness of each edge in the network. Find the edge with the highest score and delete it. The algorithm continues as long as disconnecting components increases modularity. While there is no random variation involved, it may not find the optimal solution (it may not maximise modularity), and the algorithm is slow.
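The first two steps (scoring every edge, then picking the edge to delete) can be sketched as below. Edge-betweenness is taken here as the number of shortest paths between node pairs that run over an edge, with paths split evenly when several shortest paths exist; the breadth-first accumulation scheme and all names are assumptions for illustration, and the graph is assumed to be undirected and unweighted.

```python
from collections import deque

def edge_betweenness(adj):
    """Shortest-path betweenness of every edge in an undirected, unweighted graph."""
    score = {frozenset((u, v)): 0.0 for u in adj for v in adj[u]}
    for source in adj:
        # breadth-first search that counts shortest paths (sigma) from source
        dist = {source: 0}
        sigma = {v: 0 for v in adj}
        sigma[source] = 1
        preds = {v: [] for v in adj}
        order = []
        queue = deque([source])
        while queue:
            v = queue.popleft()
            order.append(v)
            for w in adj[v]:
                if w not in dist:
                    dist[w] = dist[v] + 1
                    queue.append(w)
                if dist[w] == dist[v] + 1:
                    sigma[w] += sigma[v]
                    preds[w].append(v)
        # walk back from the farthest nodes and pass path shares to edges
        delta = {v: 0.0 for v in adj}
        for w in reversed(order):
            for v in preds[w]:
                share = sigma[v] / sigma[w] * (1.0 + delta[w])
                score[frozenset((v, w))] += share
                delta[v] += share
    # every node pair was visited from both of its endpoints
    return {edge: value / 2 for edge, value in score.items()}

# Hypothetical toy graph: two triangles joined by the edge 3-4.
toy = {1: {2, 3}, 2: {1, 3}, 3: {1, 2, 4}, 4: {3, 5, 6}, 5: {4, 6}, 6: {4, 5}}
scores = edge_betweenness(toy)
to_delete = max(scores, key=scores.get)
print(sorted(to_delete), scores[to_delete])
```

On the toy graph the bridge between the two triangles carries all nine shortest paths between the two sides, so it is the edge selected for deletion.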

Fast-Greedy Clustering Algorithm

Start with an empty graph in which each node is its own community. The modularity for each possible join between two communities (initially single nodes) is computed and the join with the highest modularity is chosen. The process is repeated until no further increase in modularity is possible. One issue is that small communities are easily missed. However, a dendrogram allows one to judge how many communities could be present.
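A naive sketch of the joining procedure is given below. It recomputes modularity from scratch for every candidate join, so it shows the logic rather than the speed that gives the algorithm its name. The function names, the recorded merge list (a stand-in for the dendrogram) and the choice $P_{ij} = k_i k_j / (2m)$ are assumptions.

```python
def modularity(adj, community):
    """Modularity with the assumed null model P_ij = k_i * k_j / (2m)."""
    m = sum(len(nbrs) for nbrs in adj.values()) / 2
    q = 0.0
    for i in adj:
        for j in adj:
            if community[i] == community[j]:
                q += (1.0 if j in adj[i] else 0.0) - len(adj[i]) * len(adj[j]) / (2 * m)
    return q / (2 * m)

def greedy_merge(adj):
    """Join communities greedily while modularity increases.

    Returns the final assignment, its modularity and the list of joins
    (label kept, label absorbed, modularity after the join).
    """
    community = {v: v for v in adj}  # each node starts as its own community
    q = modularity(adj, community)
    merges = []
    while True:
        best = None
        labels = sorted(set(community.values()))
        for idx, a in enumerate(labels):
            for b in labels[idx + 1:]:
                trial = {v: (a if c == b else c) for v, c in community.items()}
                trial_q = modularity(adj, trial)
                if trial_q > q + 1e-12 and (best is None or trial_q > best[0]):
                    best = (trial_q, a, b, trial)
        if best is None:
            return community, q, merges  # no join increases modularity
        q, a, b, community = best
        merges.append((a, b, q))

# Hypothetical toy graph: two triangles joined by the edge 3-4.
toy = {1: {2, 3}, 2: {1, 3}, 3: {1, 2, 4}, 4: {3, 5, 6}, 5: {4, 6}, 6: {4, 5}}
community, score, merges = greedy_merge(toy)
print(community, score)
for step in merges:
    print(step)
```

The list of joins plays the role of the dendrogram: reading it from the end backwards shows which communities were combined last and at what modularity.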

# ISN: Data Collection Strategies

Data collection here refers to collecting an offline social network, that is, collecting information about a particular community. The group needs to be defined (boundaries), which may be easy (e.g. school class or company) or difficult (e.g. needle-sharing).

Complete network data

A group with clear boundaries, such as a formal group or organisation. All information is collected, either by a roster (e.g. class list) or by a name generator (e.g. each person lists their contacts).

Snowball-sampled network

A population of unknown size or unclear boundaries. A step-wise sampling technique is applied to reveal larger parts of the network until the sample is large enough.
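The step-wise procedure can be sketched as sampling in waves: each wave asks the people reached in the previous wave for their contacts. In the sketch below a dictionary stands in for the answers respondents would give; this dictionary, the function name and the stopping parameters are all hypothetical.

```python
def snowball_sample(contacts, seeds, max_waves, target_size):
    """Sample in waves until the target size or the wave limit is reached.

    `contacts` maps each person to the contacts they would name
    (a stand-in for real interview data).
    """
    sampled = set(seeds)
    frontier = list(seeds)
    for _ in range(max_waves):
        if len(sampled) >= target_size or not frontier:
            break
        next_frontier = []
        for person in frontier:
            for alter in sorted(contacts.get(person, ())):
                if alter not in sampled:
                    sampled.add(alter)
                    next_frontier.append(alter)
        frontier = next_frontier  # the newly found people are asked next
    return sampled

# Hypothetical chain of contacts a-b-c-d-e, starting from seed "a".
contacts = {"a": {"b"}, "b": {"a", "c"}, "c": {"b", "d"}, "d": {"c", "e"}, "e": {"d"}}
print(sorted(snowball_sample(contacts, ["a"], max_waves=2, target_size=10)))
```

With two waves only the first three people of the chain are reached, which illustrates how the choice of seeds and the number of waves shape the part of the network that becomes visible.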

Ego-centered network data

Samples of individuals and their personal relationship structures. For instance, a person mentions their friends (ego-alter relations) and optionally the relations amongst them (or even others; alter-alter relations).

Informed consent and ethics

For any data collection, the individuals need to be informed about the goals of the study and must be able to withdraw. Participants must be aware that they are being studied. Furthermore, the data collected must be anonymous. This is increasingly difficult in social network analysis, as the names of people are intrinsic to the analysis. The personal data must be kept secure and separate from the results.

# ISN: Positions in Social Networks

Positions in a network are important for different reasons, such as well-being. In the following, several concepts are introduced to gauge positions in a social network.

Structural balance

People prefer balanced relationship structures. According to Heider (Heider, 1946), imbalances cause psychological distress. To restore balance, people create or drop ties. However, balance may not be equally important.
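A common way to make balance checkable is to code ties as positive (+1) or negative (-1) and to call a closed triad balanced when the product of its three signs is positive. The notes do not spell out this formalisation, so it is an assumption here, as are the function name and the toy data.

```python
from itertools import combinations

def unbalanced_triads(sign):
    """List closed triads whose three tie signs multiply to a negative value.

    `sign` maps frozenset({u, v}) to +1 (positive tie) or -1 (negative tie).
    """
    nodes = sorted({v for pair in sign for v in pair})
    result = []
    for triple in combinations(nodes, 3):
        pairs = [frozenset(p) for p in combinations(triple, 2)]
        if not all(p in sign for p in pairs):
            continue  # only closed triads are judged
        product = 1
        for p in pairs:
            product *= sign[p]
        if product < 0:
            result.append(triple)
    return result

# Hypothetical signed ties: A and B both like C but dislike each other
# (unbalanced); A and D like each other and both dislike B (balanced).
sign = {
    frozenset(("A", "B")): -1,
    frozenset(("A", "C")): +1,
    frozenset(("B", "C")): +1,
    frozenset(("A", "D")): +1,
    frozenset(("B", "D")): -1,
}
print(unbalanced_triads(sign))
```

In the toy data only the triad A-B-C is flagged: dropping the negative tie or turning it positive would restore balance.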

Structural holes

Occupying a structural hole means sitting between two other actors such that the only transitive connection between them passes through oneself (Burt, 2009). In a structural hole, one is exposed to different views. However, the benefit of network brokerage is probabilistic and not guaranteed.
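As a rough sketch, the pairs an actor brokers between can be listed by checking, for each pair of the actor's contacts, that the two are not tied directly and share no other common contact. This is a simplification that ignores longer indirect routes and is not Burt's constraint measure; the function name and toy network are assumptions.

```python
from itertools import combinations

def brokered_pairs(adj, ego):
    """Pairs of ego's contacts whose only two-step connection runs through ego."""
    pairs = []
    for a, b in combinations(sorted(adj[ego]), 2):
        if b in adj[a]:
            continue  # a and b are tied directly, no hole between them
        if adj[a] & adj[b] == {ego}:
            pairs.append((a, b))  # ego is their only common contact
    return pairs

# Hypothetical toy network: E knows A, B and C; A and B also know each other.
toy = {
    "E": {"A", "B", "C"},
    "A": {"E", "B"},
    "B": {"E", "A"},
    "C": {"E"},
}
print(brokered_pairs(toy, "E"))
```

Here E spans the holes between C and each of A and B, but not between A and B, who can reach each other without E.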

Embeddedness of ties

A tie embedded in a triad with two additional strong ties is called a Simmelian tie. Such ties are supposed to be more powerful, as they enforce solidarity and protect from malfeasance.
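Given a network that contains only the strong ties, Simmelian ties can be found by looking for ties whose two endpoints share at least one further strong contact. The sketch assumes that the strong ties have already been identified (for instance by a threshold on tie strength, which the notes do not specify); the names are illustrative.

```python
def simmelian_ties(strong_adj):
    """Strong ties whose endpoints share at least one common strong contact."""
    ties = set()
    for u, neighbours in strong_adj.items():
        for v in neighbours:
            if strong_adj[u] & strong_adj[v]:  # a third party tied to both
                ties.add(frozenset((u, v)))
    return ties

# Hypothetical strong-tie network: triangle A-B-C plus the tie C-D.
strong = {
    "A": {"B", "C"},
    "B": {"A", "C"},
    "C": {"A", "B", "D"},
    "D": {"C"},
}
print(sorted(sorted(tie) for tie in simmelian_ties(strong)))
```

The three ties of the triangle are Simmelian, whereas the tie C-D is strong but not embedded in a triad.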

Social capital

In general, a transferable capital that is inherent to the connections between people (Bourdieu, 1986; Coleman, 1988).

Safety and Effectance

Safety includes fulfilment of emotional needs such as trust and reputation (e.g. embeddedness). Effectance, on the other hand, means learning new things and being autonomous (e.g. brokerage through structural holes).

The ties that torture

Being in a structural hole between two Simmelian ties means having to uphold two different sets of social constraints at once, which may even be contradictory (Krackhardt, 1999).

The strength of the weak tie

Weak ties connect one to networks with different information, which allows one to acquire new knowledge (Granovetter, 1973).

Embeddedness of economic actions

Economic actions are embedded in social relations. Thus, the constrained options of actors to engage in interaction are taken into account. The action between economic actors depends on the type, strength and embeddedness of a relationship.

## References

Bourdieu, P. (1986). The Forms of Capital. In J. G. Richardson (Ed.), Handbook of Theory and Research for the Sociology of Education (pp. 241–258). Greenwood Press.
Burt, R. S. (2009). Structural holes: The social structure of competition. Harvard University Press.
Coleman, J. S. (1988). Social capital in the creation of human capital. American Journal of Sociology, 94, 95–120.
Granovetter, M. S. (1973). The strength of weak ties. American Journal of Sociology, 78(6), 1360–1380.
Heider, F. (1946). Attitudes and cognitive organization. The Journal of Psychology, 21(1), 107–112.
Krackhardt, D. (1999). The ties that torture: Simmelian tie analysis in organizations. Research in the Sociology of Organizations, 16(1), 183–210.

# ISN: Network visualisation

Today’s topic will be to visualise networks and centrality measures. We visualise a network to better understand the underlying data. A visualisation should be driven by the question that we would like to answer. Nonetheless, visualisations are by their nature exploratory. Also, visualisations do not provide evidence for hypotheses.

Visualisation usually tries to convey information through the layout. Density tries to convey cohesion. Distance tries to convey graph-theoretic distance, and tie length tries to convey attached values. Geometric symmetries try to convey structural symmetries.

The general rules of graph visualisation are that no edge crossing, overlap, asymmetry or meaningless edge length/node size should occur.

Visualisation in R

We will use either the “igraph” or “sna” library to visualise the data.

# ISN: What are Social Networks?

Social networks are based on relations between two or a few individuals, ranging from friendships through contracts to work contacts.

Throughout the course, the theory behind social networks will be put into context with methods of comparing and applying social networks. Examples from different scientific disciplines will be used to illustrate the social networks.

Network descriptives

Mathematical descriptions of networks are useful descriptives. An adjacency matrix can be used to represent a graph of nodes and edges.
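As a small illustration, an adjacency matrix for an undirected graph can be built from an edge list as follows. The function name and the toy data are assumptions; a 1 in row $i$ and column $j$ marks an edge between nodes $i$ and $j$, and the row sums give each node's degree.

```python
def adjacency_matrix(nodes, edges):
    """Symmetric 0/1 matrix with one row and one column per node."""
    index = {v: i for i, v in enumerate(nodes)}
    matrix = [[0] * len(nodes) for _ in nodes]
    for u, v in edges:
        matrix[index[u]][index[v]] = 1
        matrix[index[v]][index[u]] = 1  # undirected: mirror the entry
    return matrix

# Hypothetical toy network: a path A-B-C.
nodes = ["A", "B", "C"]
matrix = adjacency_matrix(nodes, [("A", "B"), ("B", "C")])
for row in matrix:
    print(row)

degrees = [sum(row) for row in matrix]  # node-level descriptive
print(degrees)
```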

Networks can be analysed on different levels:

• Dyad level ($O(n^2)$) or connections between nodes
• Node level ($O(n)$) or properties of nodes
• Network level ($O(1)$) or clustering of nodes.

Centrality could mean access to resources, connecting parts of the network, or taking part in interaction. For a detailed report on centrality measures, look at this post in my Complexity and Global Systems Sciences lecture notes. Different centrality measures often disagree, particularly in larger networks. The choice of centrality measure depends on the research question.

Methods

Generally, for any network, one should start with the following descriptives, before continuing to more advanced analysis.

2. Compute density of network (number of edges divided by maximal number of edges; note that the maximal number is different for directed ($e_{max} = n(n-1)$) and undirected ($e_{max} = n(n-1)/2$) graphs).
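The density computation is a one-liner once the maximal number of edges is fixed. The sketch below uses the two formulas for $e_{max}$ given above; the function name and the example numbers are assumptions.

```python
def density(n_nodes, n_edges, directed=False):
    """Number of edges divided by the maximal number of edges."""
    if n_nodes < 2:
        return 0.0  # no edge is possible with fewer than two nodes
    max_edges = n_nodes * (n_nodes - 1)
    if not directed:
        max_edges /= 2
    return n_edges / max_edges

# Hypothetical example: 4 nodes and 4 edges.
print(density(4, 4))                 # undirected: 4 of 6 possible edges
print(density(4, 4, directed=True))  # directed: 4 of 12 possible edges
```

The same edge count gives half the density in the directed case, because each pair of nodes can carry two directed edges.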