SMABSC: Disease Propagation

The SIR model was introduced as a mathematical model with differential equations (Kermack & McKendrick, 1927). The basic states are Susceptible, Infected, and Recovered.

N = S + I + R, \qquad \frac{dS}{dt}+\frac{dI}{dt}+\frac{dR}{dt} = 0

The SIR model captures the fundamental trajectory of disease propagation. It assumes that immunity is acquired after the disease and that the population is homogeneous.
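
As a minimal sketch of this trajectory, the SIR equations can be integrated numerically. The transmission rate beta, the recovery rate gamma, the step size, and the population figures below are illustrative assumptions and do not come from the lecture; the forward Euler method is chosen only for brevity.

```python
def simulate_sir(s0, i0, r0, beta, gamma, dt=0.1, steps=1000):
    """Integrate the SIR equations with the forward Euler method.

    beta (transmission rate) and gamma (recovery rate) are assumed,
    illustrative parameters.
    """
    n = s0 + i0 + r0  # total population, constant in the basic model
    s, i, r = float(s0), float(i0), float(r0)
    history = [(s, i, r)]
    for _ in range(steps):
        new_infections = beta * s * i / n * dt  # flow from S to I
        new_recoveries = gamma * i * dt         # flow from I to R
        s -= new_infections
        i += new_infections - new_recoveries
        r += new_recoveries
        history.append((s, i, r))
    return history


# Hypothetical population of 1000 with 10 initial infections.
history = simulate_sir(s0=990, i0=10, r0=0, beta=0.3, gamma=0.1)
peak_infected = max(i for _, i, _ in history)
print(f"peak number of infected: {peak_infected:.1f}")
```

Because every individual leaving one state enters another, S + I + R stays constant throughout the run, which is exactly the closed and homogeneous population that the shortcomings below criticise.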

But the SIR model has shortcomings:

  • populations are not infinite and increase/decrease over time,
  • populations are spatial objects and have (voluntary) spatial interactions,
  • populations are heterogeneous, including isolated sub-populations with irregular interactions as well as significant distances,
  • populations are driven by endogenous social factors and constrained by exogenous environmental circumscription.

Additional states were introduced, such as Exposed (i.e. dormant infections that do not infect others yet) and Maternal immunity (i.e. individuals that cannot be infected). The states have also been rearranged into variants such as SIS, SEIS, and SIRS.

These models still did not address explicit spatial references to population loci and transportation networks, the distribution of social and medical information (temporal and spatial), mechanisms to simulate voluntary and forced quarantines, treatment options and their delivery (temporal and spatial), and the characteristics of pathogens and disease vectors.

Questions that need to be answered about a model concern the form of circumscription (Carneiro, 1961, 1987, 1988) (e.g. social and environmental forces), instantiation topologies (i.e. abstract or logical relationships, social networks, or space), activation schemes (e.g. random, uniform random, Poisson), and encapsulation.


Carneiro, R. L. (1961). Slash-and-burn cultivation among the Kuikuru and its implications for cultural development in the Amazon Basin. In J. Wilbert (Ed.), The evolution of horticultural systems in native South America, causes and consequences. “Antropológica” (pp. 47–67). Caracas, Venezuela: Editorial Sucre.
Carneiro, R. L. (1987). Further reflections on resource concentration and its role in the rise of the state. In L. Manzanilla (Ed.), Studies in the Neolithic and Urban Revolutions (pp. 245–260). Oxford, UK: Archaeopress.
Carneiro, R. L. (1988). The Circumscription Theory: Challenge and response. The American Behavioral Scientist, 31(4), 497–511.
Kermack, W. O., & McKendrick, A. G. (1927). A contribution to the mathematical theory of epidemics. Proceedings of the Royal Society of London A: Mathematical, Physical and Engineering Sciences, 115(772).


ISN: Communities and cliques

Dyads on their own are not yet interesting for network research. Starting with triads, however, interesting behaviour appears: balance and control emerge. Triads also appear more commonly in social networks than in random graphs.

Clustering coefficient

The clustering coefficient measures the amount of transitivity in a network. Transitivity means that when A is related to B, and B is in turn related to C, then A is also related to C. The index ranges from 0 to 1. In social networks it is usually between 0.3 and 0.6.
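
The following sketch shows one common way to compute such an index, namely the global transitivity: the share of two-step paths A-B-C that are closed by a tie A-C. The toy network and the adjacency-dictionary representation are assumptions made for illustration; other variants of the coefficient (e.g. averaged local coefficients) exist.

```python
from itertools import combinations


def transitivity(adj):
    """Global clustering coefficient: closed two-paths / all two-paths."""
    closed = 0
    total = 0
    for node, neighbours in adj.items():
        for a, b in combinations(sorted(neighbours), 2):
            total += 1        # a - node - b is a path of length two
            if b in adj[a]:
                closed += 1   # the path is closed by the tie a - b
    return closed / total if total else 0.0


# Hypothetical toy network: a triangle A-B-C plus a node D attached to C.
adj = {
    "A": {"B", "C"},
    "B": {"A", "C"},
    "C": {"A", "B", "D"},
    "D": {"C"},
}
print(transitivity(adj))  # 3 of 5 two-paths are closed, i.e. about 0.6
```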

Triad counts

Four patterns are counted across all triples of vertices: Empty (three vertices with no connections), 1-Edge (two vertices are connected via an edge and one vertex is not), 2-Edge (one vertex is connected to both other vertices with an edge), and Triangle (all vertices are connected).
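
A small sketch of such a count for an undirected network is given below. The function name, the pattern labels, and the example network are illustrative assumptions; the approach simply enumerates every triple of vertices, which is only practical for small networks.

```python
from itertools import combinations


def triad_counts(nodes, edges):
    """Count the four undirected triad patterns over all vertex triples."""
    edge_set = {frozenset(edge) for edge in edges}
    labels = ["empty", "1-edge", "2-edge", "triangle"]
    counts = {label: 0 for label in labels}
    for triple in combinations(nodes, 3):
        # Number of ties present among the three possible pairs.
        present = sum(
            frozenset(pair) in edge_set for pair in combinations(triple, 2)
        )
        counts[labels[present]] += 1
    return counts


# Hypothetical network: triangle A-B-C plus a tie C-D.
nodes = ["A", "B", "C", "D"]
edges = [("A", "B"), ("B", "C"), ("A", "C"), ("C", "D")]
print(triad_counts(nodes, edges))
```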

Groups of more than 3

There are a few technical descriptions that extend (and modify) the notion of the complete subgraph to more than 3 nodes:

  • Clique: maximal complete subgraph of n \geq 3 nodes
  • k-clique: relaxation of the clique in which nodes may be up to geodesic distance k > 1 apart
  • k-core: subgraph in which each node is adjacent to at least k other nodes
  • k-plex: maximal subgraph of g nodes in which each node is adjacent to no fewer than g-k nodes
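
To make one of these definitions concrete, the sketch below extracts a k-core by repeatedly pruning nodes with fewer than k remaining neighbours. The function name and the toy network are assumptions for illustration; the pruning procedure is a standard way to obtain the largest subgraph satisfying the k-core condition.

```python
def k_core(adj, k):
    """Return the nodes of the k-core by pruning low-degree nodes."""
    remaining = {node: set(neigh) for node, neigh in adj.items()}
    changed = True
    while changed:
        changed = False
        for node in list(remaining):
            if len(remaining[node]) < k:
                # Remove the node and its ties from the remaining graph.
                for neighbour in remaining[node]:
                    remaining[neighbour].discard(node)
                del remaining[node]
                changed = True
    return set(remaining)


# Hypothetical network: triangle A-B-C plus a node D attached to C.
adj = {
    "A": {"B", "C"},
    "B": {"A", "C"},
    "C": {"A", "B", "D"},
    "D": {"C"},
}
print(sorted(k_core(adj, 2)))  # D is pruned, the triangle remains
```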


Communities are densely connected within and sparsely connected with others. A community structure can affect individuals, groups, networks and give insights into how social systems work.

Community detection

Community detection is a computationally difficult problem. Knowing the optimal solution is not always possible. Algorithmic approximations are often used to detect communities.


Modularity is always smaller than 1, but can also take negative values. Higher values mean more edges within modules.

Q = \frac{1}{2m}\sum_{ij}\delta(C_i,C_j)(A_{ij}-P_{ij})

Here A_{ij} encodes whether an edge exists between nodes i and j, P_{ij} is the probability of such an edge existing, m is the number of edges, and \delta(C_i,C_j) indicates whether the two nodes are inside the same module.
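
The formula can be evaluated directly, as sketched below. The text leaves P_{ij} unspecified; the sketch assumes the common choice P_{ij} = k_i k_j / (2m), where k_i is the degree of node i. The example network (two triangles joined by one edge) is hypothetical.

```python
def modularity(adj, community):
    """Q = 1/(2m) * sum_ij delta(C_i, C_j) * (A_ij - P_ij).

    Assumption: P_ij = k_i * k_j / (2m), which the text does not fix.
    """
    two_m = sum(len(neigh) for neigh in adj.values())  # each edge counted twice
    q = 0.0
    for i in adj:
        for j in adj:
            if community[i] != community[j]:
                continue  # delta(C_i, C_j) = 0
            a_ij = 1.0 if j in adj[i] else 0.0
            p_ij = len(adj[i]) * len(adj[j]) / two_m
            q += a_ij - p_ij
    return q / two_m


# Hypothetical network: triangles 1-2-3 and 4-5-6 joined by the edge 3-4.
adj = {
    1: {2, 3}, 2: {1, 3}, 3: {1, 2, 4},
    4: {3, 5, 6}, 5: {4, 6}, 6: {4, 5},
}
split = {1: "a", 2: "a", 3: "a", 4: "b", 5: "b", 6: "b"}
single = {node: "a" for node in adj}
print(modularity(adj, split), modularity(adj, single))
```

Under this assumption, putting all nodes into a single module yields a modularity of zero, whereas the split into the two triangles yields a positive value.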

Kernighan-Lin Algorithm

Nodes are randomly assigned to a pre-determined number of communities, and the modularity score is computed for switching any single node to another community. The switch that achieves the highest modularity is applied. The process is repeated until no further switch can improve the score. The solution, however, is not necessarily optimal: depending on the random initial assignment, the algorithm may end in a local maximum. The algorithm should therefore be repeated with different initial assignments.

Edge-Betweenness Clustering Algorithm

Evaluate the edge-betweenness of each edge in the network. Find the edge with the highest score and delete it. As long as the disconnection between two components increases modularity, the algorithm continues. While there is no random variation involved, the algorithm may not find the optimal solution, may not maximise modularity, and is slow.

Fast-Greedy Clustering Algorithm

The algorithm starts with an empty graph in which each node is its own community. The modularity for each possible join between two communities is computed and the join with the highest modularity is chosen. The process is repeated until no further increase in modularity is possible. One issue is that small communities are easily missed. However, a dendrogram allows one to judge how many communities could be present.
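
The greedy joining procedure can be sketched as follows. This is a naive illustration that recomputes modularity for every candidate join and is not the optimised bookkeeping that makes the published algorithm "fast"; as before, the null model P_{ij} = k_i k_j / (2m) and the example network are assumptions.

```python
from itertools import combinations


def modularity(adj, community):
    """Modularity with the assumed null model P_ij = k_i * k_j / (2m)."""
    two_m = sum(len(neigh) for neigh in adj.values())
    internal = {}  # twice the number of edges inside each community
    degree = {}    # summed degree of each community
    for node, neigh in adj.items():
        c = community[node]
        degree[c] = degree.get(c, 0) + len(neigh)
        internal[c] = internal.get(c, 0) + sum(community[n] == c for n in neigh)
    return sum(internal[c] / two_m - (degree[c] / two_m) ** 2 for c in degree)


def greedy_communities(adj):
    """Join communities greedily while a join still increases modularity."""
    community = {node: node for node in adj}  # every node starts alone
    best_q = modularity(adj, community)
    while True:
        best_join = None
        for a, b in combinations(sorted(set(community.values())), 2):
            trial = {n: (a if c == b else c) for n, c in community.items()}
            q = modularity(adj, trial)
            if q > best_q + 1e-12:
                best_q, best_join = q, trial
        if best_join is None:
            return community, best_q  # no join improves modularity
        community = best_join


# Hypothetical network: triangles 1-2-3 and 4-5-6 joined by the edge 3-4.
adj = {
    1: {2, 3}, 2: {1, 3}, 3: {1, 2, 4},
    4: {3, 5, 6}, 5: {4, 6}, 6: {4, 5},
}
community, q = greedy_communities(adj)
print(community, round(q, 3))
```

Recording the sequence of joins would give the dendrogram mentioned above; the sketch only returns the final partition.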




PE: Redistribution

The focus of today’s lecture will be on redistribution as discussed in Chapter 3 (Mueller, 2003). Additionally, we will discuss papers quantitatively assessing the situation (De Haan & Sturm, 2017; Sturm & de Haan, 2015).

A justification for the state can be redistribution. But redistribution itself can be argued for based on different reasons. In this post, we will illuminate the main arguments. First, four voluntary redistribution arguments will be covered, then we will have a look at involuntary redistribution.

Redistribution as insurance

If one assumes Rawls’ veil of ignorance (Rawls, 2009), redistribution can be seen as insurance against the uncertainty about which role one will assume in society. Insurance can be provided privately, so at first state intervention may seem unnecessary. However, since people can assess their own risk, high-risk individuals would buy the insurance whereas low-risk individuals would shun it. To overcome this issue of adverse selection, public insurance is introduced. The issue of adverse selection was introduced by Akerlof (Akerlof, 1970), who showed that information asymmetry can break markets. Public insurance overcomes this issue by enforcing a Pareto optimum on a societal level. Typical cases are health insurance, unemployment insurance, and retirement insurance.

Redistribution as public good

Another justification comes from altruism or empathy (“warm glow”). The utility function is expanded so that an individual maximises U_m + \alpha U_o, where 0\leq\alpha\leq 1 weights the utility of others U_o relative to one's own utility U_m.

Redistribution as fairness norm

The assumption that fairness is an important norm is the basis for this redistribution argument. The classical example is the dictator game, where anonymous individuals are paired and one of them receives an amount of money and may share it with the other. Usually, individuals share around 30% with the other despite being able to keep everything and not knowing anything about the other. So far, the assumption is that the random element of the game leads people to share their gain because they could also have ended up on the other side.

Redistribution as allocative efficiency

Consider two individuals (P and U) who work a fixed amount of land. The productivity of P is 100 whereas U's productivity is 50. The connecting curve describes the production possibility frontier. An initial allocation (e.g. A) may not be optimal on a societal level (i.e. A is not the point of tangency with a 45° line). The societal optimum would be at B, which is, however, unacceptable for U. The inefficient allocation would end up at A'. The state could either redistribute land to reach B or redistribute production to reach C. Note that C in the graph should amount to a value above 100. Alternatively, private contracting could reach the same result, given that the state enforces property rights and contracts.

The example is based on (Bös & Kolmar, 2003).

Redistribution as taking

Groups can lobby to increase their utility U by increasing their income Y, using the political resources R available to them. However, if two antagonistic groups lobby, their efforts may cancel each other out, leaving them only with the additional cost of lobbying without any gains.

Measuring redistribution

To measure redistribution, inequality needs to be measured first. Inequality is typically measured via the Lorenz curve and the Gini coefficient (Gini, 1912). The Gini coefficient is a ratio of two areas: the area between the line of perfect equality and the Lorenz curve, divided by the total area under the line of perfect equality. The market Gini coefficient (before taxes) and the net Gini coefficient (after taxes and subsidies) are two variants; comparing them, for example as a ratio, helps to assess redistribution.
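
As an illustration, the sketch below computes a Gini coefficient from the discrete Lorenz curve and compares a market distribution with a net distribution. The income figures are invented, and the relative reduction at the end is only one possible summary of redistribution, not necessarily the one used in the cited papers.

```python
def gini(incomes):
    """Gini coefficient from the discrete Lorenz curve (trapezoid rule)."""
    values = sorted(incomes)
    n = len(values)
    total = sum(values)
    if n == 0 or total == 0:
        return 0.0
    cumulative = 0.0
    area_under_lorenz = 0.0
    for value in values:
        previous = cumulative
        cumulative += value / total
        # Trapezoid between two neighbouring points of the Lorenz curve.
        area_under_lorenz += (previous + cumulative) / (2 * n)
    # The area under the line of equality is 0.5, so G = (0.5 - L) / 0.5.
    return 1.0 - 2.0 * area_under_lorenz


# Hypothetical incomes of four households before and after taxes/subsidies.
market_incomes = [10, 20, 30, 140]
net_incomes = [30, 40, 50, 80]

gini_market = gini(market_incomes)  # approximately 0.5
gini_net = gini(net_incomes)        # approximately 0.2
relative_redistribution = (gini_market - gini_net) / gini_market
print(gini_market, gini_net, relative_redistribution)
```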

The causes of inequality are difficult to assess. Some attribute it to politics (Stiglitz, 2014), whereas others attribute it to market-based economies (Muller, 2013). A new line of inquiry attributes inequality to ethno-linguistic fractionalisation reducing the interest in redistribution (Desmet, Ortuño-Ortín, & Wacziarg, 2012).

Sturm and de Haan (Sturm & de Haan, 2015) follow up on the argument and examine the relationship between capitalism and income inequality. A large sample of countries is analysed using an adjusted economic freedom (EF) index as a proxy for capitalism and Gini coefficients as a proxy for income inequality. Additionally, they analyse the relation between income inequality and fractionalisation given similar capitalist systems. For the first analysis, there is no conclusive evidence that capitalism and income inequality are linked. However, if fractionalisation is taken into account, then inequality can be explained based on the level of fractionalisation. The more fractionalised a society is, the less redistribution takes place and consequently inequality remains high.

In a second paper, de Haan and Sturm (De Haan & Sturm, 2017) analyse how financial development impacts income inequality. Previous research on financial development, financial liberalisation and banking crises (theoretical and empirical) has been ambiguous. TBC.


Akerlof, G. A. (1970). The market for “lemons”: Quality uncertainty and the market mechanism. The Quarterly Journal of Economics, 488–500.
Bös, D., & Kolmar, M. (2003). Anarchy, efficiency, and redistribution. Journal of Public Economics, 87(11), 2431–2457.
De Haan, J., & Sturm, J.-E. (2017). Finance and Income Inequality: A review and new evidence. European Journal of Political Economy, Forthcoming.
Desmet, K., Ortuño-Ortín, I., & Wacziarg, R. (2012). The Political Economy of Linguistic Cleavages. Journal of Development Economics, 97(2), 322–338.
Gini, C. (1912). Variabilità e mutabilità. In E. Pizetti & T. Salvemini (Eds.), Memorie di metodologica statistica (p. 1). Rome: Libreria Eredi Virgilio Veschi.
Mueller, D. C. (2003). Public Choice III. Cambridge, UK: Cambridge University Press.
Muller, J. Z. (2013). Capitalism and inequality: What the right and the left get wrong. Foreign Affairs, 92(2), 30–51.
Rawls, J. (2009). A theory of justice. Harvard University Press.
Stiglitz, J. (2014). Inequality is not inevitable. New York Times, pp. 1–2.
Sturm, J.-E., & de Haan, J. (2015). Income Inequality, Capitalism and Ethno-Linguistic Fractionalization. American Economic Review: Papers and Proceedings, 105(5), 593–597.


ISN: Data Collection Strategies

Data collection here refers to collecting data on an offline social network, i.e. information about a particular community is collected. The group needs to be defined (boundaries), which may be easy (e.g. school class or company) or difficult (e.g. needle-sharing).

Complete network data

Complete network data covers a group with clear boundaries, such as a formal group or organisation. All information is collected, either via a roster (e.g. class list) or via a name generator (e.g. each person lists their contacts).

Snowball-sampled network

Snowball sampling is used for a population of unknown size or with unclear boundaries. A step-wise sampling technique is applied to reveal larger and larger parts of the network until the sample is large enough.
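
The step-wise procedure can be sketched as follows. The name generator, the seed person, the names, and the fixed number of waves used as a stopping rule are all illustrative assumptions; in a real study the name generator would be an interview or questionnaire rather than a dictionary lookup.

```python
def snowball_sample(name_generator, seeds, waves):
    """Expand a sample wave by wave from the contacts named by respondents."""
    sampled = set(seeds)
    frontier = list(seeds)
    ties = set()
    for _ in range(waves):
        next_frontier = []
        for person in frontier:
            for contact in name_generator(person):
                ties.add(frozenset((person, contact)))
                if contact not in sampled:
                    sampled.add(contact)           # newly revealed person
                    next_frontier.append(contact)  # to be asked in the next wave
        frontier = next_frontier
    return sampled, ties


# Hypothetical answers to the question "who are your contacts?".
contacts = {
    "Ann": ["Bob", "Cem"],
    "Bob": ["Ann", "Dee"],
    "Cem": ["Ann"],
    "Dee": ["Bob", "Eli"],
    "Eli": ["Dee"],
}
sampled, ties = snowball_sample(lambda p: contacts.get(p, []), ["Ann"], waves=2)
print(sorted(sampled), len(ties))
```

With two waves starting from Ann, Eli is not yet revealed; a third wave would be needed to reach that part of the network.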

Ego-centered network data

Ego-centered network data consists of samples of individuals and their personal relationship structures. For instance, a person mentions their friends (ego-alter relations) and optionally the relations amongst them (or even others; alter-alter relations).

Informed consent and ethics

For any data collection, the individuals need to be informed about the goals of the study and must be able to withdraw. A participant must be aware that she/he is studied. Furthermore, the data collected must be anonymous. This is increasingly difficult in social network analysis as the names of people are intrinsic to the analysis. Personal data must therefore be kept secure and separate from the results.


ISN: Positions in Social Networks

Positions in a network are important for different reasons, such as well-being. In the following, several concepts will be introduced to gauge positions in a social network.

Structural balance

People prefer balanced relationship structures. According to Heider (Heider, 1946), imbalances cause psychological distress. To restore balance, people create or drop ties. However, balance may not be equally important.

Structural holes

Occupying a structural hole means sitting between two other actors such that the only connection between them passes through oneself (Burt, 2009). In a structural hole, one is exposed to different views. However, network brokerage is a probability and may not guarantee advantages.

Embeddedness of ties

A tie embedded in a triad with two additional strong ties is called a Simmelian tie. They are supposed to be more powerful as they enforce solidarity and protect from malfeasance.

Social capital

In general, social capital is a transferable capital that is inherent to the connections between people (Bourdieu, 1986; Coleman, 1988).

Safety and Effectance

Safety includes the fulfilment of emotional needs such as trust and reputation (e.g. embeddedness). Effectance, on the other hand, means learning new things and being autonomous (e.g. brokerage through structural holes).

The ties that torture

Being in a structural hole between two Simmelian ties means to have to uphold two different social constraints at once that may even be contradictory (Krackhardt, 1999).

The strength of the weak tie

Weak ties connect one to networks with different information, which allows one to acquire new knowledge (Granovetter, 1973).

Embeddedness of economic actions

Economic actions are embedded in social relations. Thus, the constrained options of actors to engage in interaction are taken into account. The actions between economic actors depend on the type, strength and embeddedness of a relationship.


Bourdieu, P. (1986). The Forms of Capital. In J. G. Richardson (Ed.), Handbook of Theory and Research for the Sociology of Education (pp. 241–58). Greenwood Press.
Burt, R. S. (2009). Structural holes: The social structure of competition. Harvard University Press.
Coleman, J. S. (1988). Social capital in the creation of human capital. American Journal of Sociology, 94, 95–120.
Granovetter, M. S. (1973). The strength of weak ties. American Journal of Sociology, 78(6), 1360–1380.
Heider, F. (1946). Attitudes and cognitive organization. The Journal of Psychology, 21(1), 107–112.
Krackhardt, D. (1999). The ties that torture: Simmelian tie analysis in organizations. Research in the Sociology of Organizations, 16(1), 183–210.


PE: Bureaucracy theory

We will be looking at the political process in an exogenous political environment. Policies are demanded by citizens/voters and interest groups, whereas they are supplied by delegates/representatives/politicians and the public administration. Public administration is claimed to be motivated by either rent-seeking or community engagement.

A central question becomes how to measure the quality of government and public services. Four typical approaches are surveys, input-output comparisons, the efficiency frontier (how well/efficiently do they perform compared to an optimum), and efficiency and effectiveness measures.

Economic Theory of Bureaucracy

A homo oeconomicus optimises his/her own utility. Whereas an entrepreneur simply optimises profit, a bureaucrat is forced to pursue self-interest within institutional constraints that often exclude economic profit. Max Weber (Weber, 2002) argues that a natural objective of a bureaucrat is power. Russell (Russell, 2004) offers three subdivisions: direct physical power, rewards and punishment, and influence on opinion. Only under uncertainty does the potential arise to exert the last type of power, whereas information creates the opportunity to actually do so.

Power allows for personal advantages and bureaucrats accrue non-monetary benefits, job security, jobs for relatives, additional benefits, work efforts, reputation, and more. To justify/hide these advantages, nonpecuniary goals of a bureaucrat become size of the bureaucracy, slack within the bureaucracy and risk-aversion.

To analyse bureaucracy, usually a two-actor environment is considered: a sponsor who delegates a task and a bureau that executes it and delivers results. Issues arise due to conflicting interests (providing a public good versus personal advantages) and information asymmetry (the true cost is only known to the bureau). Subsequently, measurement issues create a monitoring problem. Usually, only the activity of a bureau can be observed, but not its output (e.g. national defence, education). As the sponsor is a monopsonistic buyer (on behalf of society) and the bureau is a monopolistic supplier (to avoid wasteful duplication), efficiency is not enforced and information cannot be sourced from an alternative supplier.

Observations in reality have shown that bureaucrats' wages are unrelated to efficiency and that the performance underlying any form of performance-dependent pay would be hard to measure.

Budget-maximising bureaucrat

Proposed by Niskanen (Niskanen, 1971), it constitutes a simple model that assumes:

  1. Personal interests of bureaucrats are followed by maximising the budget
  2. The bureau has a monopoly position
  3. The cost function is not known by the sponsor
  4. The bureau can make all-or-nothing budget proposals

It follows that their budget B depends on the perceived output level Q (B=B(Q); B'>0; B''\leq 0) and that their costs C depend on the output level Q (C=C(Q); C'>0; C''\geq 0).

A bureaucrat then has the Lagrangian objective function O_B=B(Q)+\lambda(B(Q)-C(Q)), where the Lagrangian multiplier \lambda represents the marginal utility of relaxing the budget constraint of the bureau and hence is positive. Differentiating with respect to Q and \lambda and setting the derivatives to zero, one obtains B'(Q)=\frac{\lambda}{1+\lambda}C'(Q) and B(Q)=C(Q). The optimal outcome for society would be B'(Q)=C'(Q).
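
Spelled out step by step, and using only the symbols defined above, the two first-order conditions follow as:

```latex
\begin{aligned}
O_B &= B(Q) + \lambda\,\bigl(B(Q) - C(Q)\bigr) \\
\frac{\partial O_B}{\partial Q} &= B'(Q) + \lambda B'(Q) - \lambda C'(Q) = 0
  \;\Rightarrow\; B'(Q) = \frac{\lambda}{1+\lambda}\,C'(Q) \\
\frac{\partial O_B}{\partial \lambda} &= B(Q) - C(Q) = 0
  \;\Rightarrow\; B(Q) = C(Q)
\end{aligned}
```

Since \lambda is positive, the factor \frac{\lambda}{1+\lambda} is smaller than 1, so at the bureau's optimum B'(Q) < C'(Q). Given B''\leq 0 and C''\geq 0, this implies an output level beyond the point where B'(Q)=C'(Q), i.e. beyond the social optimum.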

The bureaucrat can know the sponsor’s social surplus and compute and request a budget that reduces the surplus to zero, unless B'\leq0. The additional condition explains a bureau’s infringement into other domains to justify increasing the budget.

Alternative institutional assumptions

Any of the four assumptions can be relaxed. Relaxing assumption 4, a sponsor could require more than one budget proposal with different levels of activities, thereby weakening the agenda-setting role of bureaucrats. The bureau must announce the price P at which it will supply a level Q, which is subsequently set by the sponsor. The bureau now chooses the P that maximises B. The demand elasticity \eta=\frac{P}{Q}\frac{dQ}{dP} can be used by the bureaucrat to choose the largest budget. Under the assumptions of a linear demand schedule, constant marginal costs, and known sponsor demand, a bureau will ask for a price P such that demand is unit elastic (|\eta|=1), as long as this price is higher than its marginal costs.
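
A numerical sketch of this result is given below. The linear demand schedule Q = a - bP, its parameter values, the marginal cost, and the grid search are illustrative assumptions; the budget is taken to be B = P * Q. With the elasticity defined as above, the value at the optimum is negative, so it is its absolute value that equals one.

```python
def budget_maximising_price(a, b, marginal_cost, steps=10000):
    """Grid-search the price that maximises B = P * Q for Q = a - b * P."""
    best_price, best_budget = None, float("-inf")
    max_price = a / b  # price at which demand falls to zero
    for step in range(1, steps):
        price = max_price * step / steps
        if price < marginal_cost:
            continue  # assumed: the bureau does not price below marginal cost
        budget = price * (a - b * price)
        if budget > best_budget:
            best_price, best_budget = price, budget
    return best_price, best_budget


def elasticity(a, b, price):
    """Demand elasticity (P / Q) * dQ/dP for the linear demand Q = a - b * P."""
    quantity = a - b * price
    return (price / quantity) * (-b)


# Hypothetical demand Q = 100 - 2P with marginal cost 10.
price, budget = budget_maximising_price(a=100, b=2, marginal_cost=10)
eta = elasticity(100, 2, price)
print(price, budget, eta)
```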

Other assumptions can be challenged as well. Monitoring can be introduced (relaxing assumption 3) where investigative bodies sift through the expenses of public bodies and eventually unveil oversized budgets. Risk-aversion has been ignored so far. The sponsor could conceal its demand or a market for a service (i.e. competing bureaus) could be introduced.

Alternative behavioural assumptions

On the one hand, a slack-maximising bureaucrat wants to maximise x-inefficiency to increase his/her gain relative to the service provided. The minimally accepted service is provided at the indifference optimum below the social optimum such that a larger share of the budget can be acquired. On the other hand, a risk-averse bureaucrat would choose actions with the lowest potential penalties. Therefore, avoidance of action often occurs.

Other behavioural considerations that influence a bureaucrat's behaviour can be crowding-out effects of intrinsic motivation by monitoring, social norms, availability bias (work only on salient risks), and lacking feedback mechanisms.

Counter arguments

Promotions in bureaucracies are highly competitive and require good “track records” of the respective bureaucrat, so there is a market for bureaucrats. The discretionary power is actually lower than in the private sector and therefore inefficiency may not be properly judged. Lastly, (democratic) governments are under (re-election) pressure and therefore are keen on monitoring the efficiency of a bureaucracy.

Power of the agenda setter

The agenda setter can prepare a choice where the non-acceptance of an oversized budget would result in the under-provision of a service. The ability to set the agenda therefore results in power.

Control of the public sector

To reduce excesses in the public sector, several measures are usually introduced: some form of competition (e.g. between administrative units, or from private services), tighter budget constraints (e.g. designated taxes (earmarking), a limited tax base, and auditing), political restrictions (e.g. direct votes, separation of administration and politics), and rewards and punishment.


Niskanen, W. A. (1971). Bureaucracy and representative government. Transaction Publishers.
Russell, B. (2004). Power: A new social analysis. Psychology Press.
Weber, M. (2002). Wirtschaft und Gesellschaft: Grundriss der verstehenden Soziologie. Mohr Siebeck.


IAP: Netheads and Bellheads

In the 1990s, the great debate on how the Internet should be developed was dubbed Netheads versus Bellheads. The Netheads originated from the people who developed network technology, whereas the Bellheads originated from the Bell Laboratories, a research institution of telecommunication companies. At the core was a technical discussion on whether packet-switching or circuit-switching is more useful, how big the meta-data overhead should be, and how long the setup of a connection takes. However, technological development over the next decade rendered the debate irrelevant. In retrospect, it also seems to have been easier to expand the network with IP (just assign an address) than with ATM (establish circuits). Also, the technical specifications of ATM turned out so complicated that they became too cumbersome.

The technical debates were just a superficial expression of an underlying discussion. The Netheads viewed the Internet from a data perspective whereas the Bellheads viewed the Internet from a (continuous) signal perspective. Another issue was that the Netheads came from a young industry with little to no corporate backing (market entrant) whereas the Bellheads had a century of corporate history behind them (incumbent). The Netheads nearly waged a crusade. The debate can also be viewed through the lens of US politics, where Netheads lean liberal and Bellheads lean conservative.


CSD: Space Syntax Theory

Space syntax is a social theory on the use of space. It encompasses a set of theories and techniques that examine relationships between man (e.g. individual/user/society) and the environment (in/outdoor).

Recommended basic readings are Lynch’s “The image of the city” (Lynch, 1960) as well as “Space is the machine” (Hillier, 2007). An advanced reading is “The social Logic of Space” (Hillier & Hanson, 1989), which also introduced Space Syntax.

Spatial Configuration

Spatial configuration defines how the relation between spaces A and B is modified by their relation to space C (Hillier, 2007).

Representation of Space

Isovists, also called viewsheds in geography, are the volume/area of the 360° field of view from particular points of view. Lines of sight are used to construct the isovists. Partial isovists are also constructed to mimic the human field of view. Psychologists have suggested (but not yet quite proven) that the shapes of isovist polygons influence the behaviour of humans. Each point of view generates its own isovist. Visibility Graph Analysis (VGA) converts the set of isovists into a measure of visibility.

When people move, they like to move in straight lines (confirmed in Spatial Cognition). Axial lines provide a potential for long straight lines which could be walked upon. A typical analysis chooses the minimal set of longest axial lines from which the complete space can be seen.

The major assumptions of space syntax are that people move in lines (axial lines), see changes in visual fields (VGA), and interact in convex space (which is not covered here).

Measuring centrality and graphs

To convert a road network into a graph, the roads are taken as nodes and the connections between roads as edges. Curves are replaced by a set of lines that mimic the curvature. Segment angular analysis splits roads into segments (according to connections to other roads) and additionally weights the connections by changes of direction. Essentially, degree centrality is measured. Other measures of network centrality are used as well. Closeness Centrality is called Integration in Space Syntax. Betweenness Centrality is called Choice in Space Syntax. Other centrality measures are currently not applied in Space Syntax.
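
The conversion and one of the measures can be sketched as follows. The road names and their intersections are invented, and plain closeness centrality is used as a stand-in for Integration; Space Syntax software typically applies its own normalisation, which is not reproduced here.

```python
from collections import deque


def closeness(adj, source):
    """Closeness centrality: (reached nodes - 1) / sum of shortest distances."""
    distance = {source: 0}
    queue = deque([source])
    while queue:  # breadth-first search over the road graph
        current = queue.popleft()
        for neighbour in adj[current]:
            if neighbour not in distance:
                distance[neighbour] = distance[current] + 1
                queue.append(neighbour)
    total = sum(distance.values())
    return (len(distance) - 1) / total if total else 0.0


# Hypothetical road network: roads are nodes, intersections are edges.
roads = {
    "Main St": {"1st Ave", "2nd Ave", "3rd Ave"},
    "1st Ave": {"Main St"},
    "2nd Ave": {"Main St"},
    "3rd Ave": {"Main St", "Side Ln"},
    "Side Ln": {"3rd Ave"},
}
integration = {road: closeness(roads, road) for road in roads}
most_integrated = max(integration, key=integration.get)
print(most_integrated, integration[most_integrated])
```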


Hillier, B. (2007). Space is the machine: a configurational theory of architecture. Space Syntax.
Hillier, B., & Hanson, J. (1989). The social logic of space. Cambridge University Press.
Lynch, K. (1960). The image of the city. MIT Press.


SMABSC: Social Networks

Social networks often give structure to relations. They can be considered as abstract, mathematically tractable, and computationally instantiable systems. Social networks have become a field of their own. The field is highly interdisciplinary, touching mathematics (graph theory), computer science (algorithms), sociology (population group trends), psychology (individual and social behaviour), and complex network theory.

Interpersonal contact caused social networks to emerge. A social network can be understood as a descriptor for social trends (Cioffi-Revilla, 2013). The basic elements are Nodes (units of observation), Edges (relationships), and Aggregations (Dyads, Triads, Cliques, Clusters, etc.). More advanced elements are Descriptive Properties (e.g. centrality measures).

A network can also be seen as an abstract topology and “social glue”. Agents can move around the network by jumping from node to node, either only where there is a connecting edge or between any nodes. Alternatively, nodes can be mapped onto agents, either by allowing agents to move around a raster or along the edges.

A network trades off regularity and complexity, relative size and relative complexity as well as network complexity and network connectivity.

Social Network Analysis

Social Network Analysis (SNA) is based on a machine-readable representation of a social network, i.e. an adjacency matrix. While there is no “best measure” to describe a node or edge, there are several useful descriptive properties.
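
As a small illustration of such a machine-readable representation, the sketch below builds an adjacency matrix for an undirected network and derives two simple descriptive properties from it. The node labels, the ties, and the helper names are assumptions made for the example.

```python
def adjacency_matrix(nodes, edges):
    """Build a symmetric 0/1 adjacency matrix for an undirected network."""
    index = {node: position for position, node in enumerate(nodes)}
    matrix = [[0] * len(nodes) for _ in nodes]
    for a, b in edges:
        matrix[index[a]][index[b]] = 1
        matrix[index[b]][index[a]] = 1  # undirected tie: symmetric entry
    return matrix


def density(matrix):
    """Share of possible ties that are present."""
    n = len(matrix)
    ties = sum(sum(row) for row in matrix) / 2  # each tie appears twice
    return ties / (n * (n - 1) / 2) if n > 1 else 0.0


# Hypothetical network: triangle A-B-C plus a tie C-D.
nodes = ["A", "B", "C", "D"]
edges = [("A", "B"), ("B", "C"), ("A", "C"), ("C", "D")]
matrix = adjacency_matrix(nodes, edges)
degrees = [sum(row) for row in matrix]  # row sums give each node's degree
print(matrix, degrees, density(matrix))  # 4 of 6 possible ties are present
```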

Bridging and spanning nodes can be identified. Also, cliques and clusters can be identified, which gives a relative density of the network. Lastly, measures of relative Connectedness and Centrality are often used.

Social Psychology

Instead of observing the network as a whole, it can be analysed from the node perspective. Nodes can be grouped into a “self” (ego) or “other” (alter). The purpose of the “self” is “self-motivated” action relative to its role and its subjective network knowledge. If nodes are “other”, their function is that of an arbiter or reactive agent. In this view, edges represent social connectivity in the network. They represent evidence of physical, informational, and/or some other material or non-material transfer or contact between nodes. Typically, the edges suggest some social binding between individuals and/or groups of nodes. Finally, an edge often connotes implicit temporal properties. Dyads are any two connected nodes in the network, triads are any three connected nodes, and cliques are larger. Simmelian ties are strong, bidirectional social bindings.


ISN: Network visualisation

Today’s topic will be to visualise networks and centrality measures. We visualise a network to better understand the underlying data. A visualisation should be driven by the question that we would like to answer. Nonetheless, visualisations are by their nature exploratory. Also, visualisations do not provide evidence for hypotheses.

Visualisation usually tries to convey information through the layout. Density tries to convey cohesion. Distance tries to convey graph-theoretic distance, and tie length tries to convey attached values. Geometric symmetries try to convey structural symmetries.

General rules of graph visualisation are that no edge crossings, overlaps, asymmetries, or meaningless edge lengths/node sizes should occur.

Visualisation in R

We will use either the “igraph” or “sna” library to visualise the data.