Closeness Centrality

The closeness centrality is based on the average shortest path length, also known as the geodesic distance, between a vertex and all other vertices in the network (Newman, 2010). It was first published by Sabidussi (1966). Since closeness is the opposite of distance, the average shortest path length is inverted to obtain the closeness centrality. A higher closeness centrality means that a vertex is closer to all other vertices (Scott, 2012) and can interact more quickly with all other vertices (Wasserman et al., 1994).

Because the closeness centrality measures the distance from one specific node to all others, it is of ego-centric scope. This metric was proposed by Smith et al. (2009), Hacker et al. (2015), Viol et al. (2016) and Berger et al. (2014).

Calculation

The calculation is based on Wasserman (1994). To calculate the closeness centrality, the number of edges on the shortest path between two nodes $v_i \in V$ and $v_j \in V$ is defined as the $distance(v_i, v_j)$; this equals the shortest path length or the geodesic distance.

To get the closeness centrality $c$, the sum of all distances is calculated and inverted: $$ c(v_i) = \left[\sum_{v_j \in V}distance(v_i, v_j)\right]^{-1}\quad\text{, where }v_j \neq v_i . $$
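The formula above can be sketched in a few lines of Python. This is only an illustration under stated assumptions: the network is unweighted, undirected and connected, it is stored as an adjacency dictionary, and the names `bfs_distances`, `closeness` and `path_graph` are hypothetical and do not come from the cited literature.

```python
from collections import deque


def bfs_distances(adjacency, source):
    """Return the number of edges on a shortest path from source to each reachable node."""
    distances = {source: 0}
    queue = deque([source])
    while queue:
        node = queue.popleft()
        for neighbour in adjacency[node]:
            if neighbour not in distances:
                distances[neighbour] = distances[node] + 1
                queue.append(neighbour)
    return distances


def closeness(adjacency, node):
    """Invert the sum of distances from node to every other node.

    Assumes a connected network with at least two nodes; unreachable
    nodes are not handled here.
    """
    distances = bfs_distances(adjacency, node)
    total = sum(d for other, d in distances.items() if other != node)
    return 1.0 / total


# Hypothetical example network: the path a - b - c - d
path_graph = {"a": ["b"], "b": ["a", "c"], "c": ["b", "d"], "d": ["c"]}

print(closeness(path_graph, "b"))  # distances 1 + 1 + 2 = 4, so 0.25
```

A breadth-first search is sufficient here because every edge counts as one step; a weighted network would need a different shortest-path routine.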

To compare networks of different sizes, Wasserman (1994) as well as Freeman (1979) picked up the suggestion by Beauchamp (1965) to standardize the metric by multiplying it by the network size $g$ minus one: $$ c^{'}(v_i) = (g-1) * \left[\sum_{v_j \in V}distance(v_i, v_j)\right]^{-1}\quad\text{, where }v_j \neq v_i . $$
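As a quick check of both formulas, consider a hypothetical path network of four vertices $a - b - c - d$, so that $g = 4$ (this example is an illustration and is not taken from the cited literature). For the inner vertex $b$ the distances to $a$, $c$ and $d$ are 1, 1 and 2: $$ c(b) = \left[1 + 1 + 2\right]^{-1} = \frac{1}{4}, \qquad c^{'}(b) = (4-1) * \frac{1}{4} = 0.75 . $$

For the outer vertex $a$ the distances to $b$, $c$ and $d$ are 1, 2 and 3: $$ c(a) = \left[1 + 2 + 3\right]^{-1} = \frac{1}{6}, \qquad c^{'}(a) = (4-1) * \frac{1}{6} = 0.5 . $$

The standardized value reaches its maximum of 1 only when a vertex is directly connected to all other vertices.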

This formula is problematic for disconnected networks: the distance to an unreachable node is infinite, so the sum of distances becomes infinite and the closeness drops to zero. While Newman (2010) suggests using the inverse distance instead of the distance, he acknowledges that this is rarely used in practice. Instead, the formula variant from the igraph package is used, which states that "if there is no path between vertex $v_i$ and $v_j$ then the total number of vertices is used in the formula instead of the path length"^igc.

Therefore we define a new distance function as: $$ distance^{'}(v_i, v_j) = \begin{cases} distance(v_i, v_j) & \text{, if there is a path} \\ g & \text{, otherwise}. \end{cases} $$

The final formula looks like this: $$c^{'}(v_i) = (g-1) * \left[\sum_{v_j \in V}distance^{'}(v_i, v_j)\right]^{-1}\quad\text{, where }v_j \neq v_i .$$
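A minimal Python sketch of this final formula follows. It implements the rule quoted in the text directly and does not call igraph itself; the adjacency-dictionary representation, the function name `adjusted_closeness`, the example `network` and the choice to return 0.0 for a single-node network are all assumptions made for illustration.

```python
from collections import deque


def adjusted_closeness(adjacency, node):
    """Standardized closeness where unreachable nodes count with distance g."""
    g = len(adjacency)  # network size
    if g < 2:
        return 0.0  # assumption: the formula is undefined for a single node

    # Breadth-first search gives the shortest path length to each reachable node.
    distances = {node: 0}
    queue = deque([node])
    while queue:
        current = queue.popleft()
        for neighbour in adjacency[current]:
            if neighbour not in distances:
                distances[neighbour] = distances[current] + 1
                queue.append(neighbour)

    # distance'(v_i, v_j): the path length if a path exists, otherwise g.
    total = sum(distances.get(other, g) for other in adjacency if other != node)
    return (g - 1) / total


# Hypothetical example network: the path a - b - c plus an isolated node d
network = {"a": ["b"], "b": ["a", "c"], "c": ["b"], "d": []}

print(adjusted_closeness(network, "b"))  # (4 - 1) / (1 + 1 + 4) = 0.5
print(adjusted_closeness(network, "d"))  # (4 - 1) / (4 + 4 + 4) = 0.25
```

With this substitution the isolated node still receives a finite, low value instead of making the sum infinite.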

Interpretation

The value can be interpreted in the context of Social Networks. Newman (2010) mentions that a user with high closeness centrality is able to quickly respond and interact with other users due to the short distance. Such a user can efficiently disseminate information through the network because of the short communication paths to others. This argument is confirmed by Berger et al. (2014), who state that users with high closeness centrality are able to spread information easily. Similar to the interpretation of degree centrality, Berger et al. (2014) further claim that a high closeness centrality is related to being a key user. Key users are knowledge hubs, meaning they contribute and help other users to solve their daily problems. They are able to diffuse innovative ideas quickly to other people.

Smith et al. (2009) relate high closeness centrality to users who regularly spawn new discussions and ideas as well as take part in other users' threads. Conversely, a low closeness centrality is related to people who tend to reply only in other people's threads and do not initiate discussions on their own. The notion of engagement is introduced by Hacker et al. (2015). On the one hand, a high closeness centrality indicates high levels of continuous engagement by a user in the Enterprise Social Network; on the other hand, such a user is not very focused, i.e. they talk about multiple topics rather than being an expert in one topic.

Hacker et al. (2015) draw connections to Viegas (2004), who found that the closeness centrality can be related to the frequency of posts. A high closeness centrality and degree centrality indicate a high post count, which is discussed in the metric messages created. Furthermore, Hacker et al. (2015) link to Holtzblatt (2013) and their results on valuable themes of social platform experience. They claim that high engagement supports collaboration and facilitates cooperation with staff in other locations. It also strengthens social connection by expanding a user's network and tracking other people's activities. Since a high closeness centrality implies that a user can easily communicate with all other users, both of these statements are reasonable to make.

Viol et al. (2016) pick up the interpretation of continuous engagement in an Enterprise Social Network. A high closeness centrality indicates that a user is well connected within the network. Such users are always online and active and can therefore initiate and take part in multiple discussions. This also means that they are not focused on one topic, but rather dispersed across many discussions and threads (Viol, 2016), which fits the interpretation in the other literature.

Similar to the degree centrality: if multiple users exhibit a high closeness centrality, it leads to a dense and cohesive network. The short distance between all users results in strong ties between the users. Strong ties are a reason for Social Capital and the formation of effective norms and trust (Riemer, 2005). According to Coleman (1990) effective norms and trust among the users allow for successful collaboration.

The interpretation that information can be disseminated quickly fits the effect of Social Capital described by Nahapiet et al. (1998). They argue that "Social Capital constitutes a valuable source of information benefits" (p. 252). It manifests itself in the distribution of information, making information readily available. Because of the trust established through strong ties, the distribution of complex or sensitive information in particular is positively influenced by Social Capital, as noted by Riemer (2005) and Koka et al. (2002). The dissemination of arbitrary and small pieces of information, by contrast, is more positively influenced by weak ties and thus by a low closeness centrality.