In-class_Ex2

Published

November 18, 2023

Modified

November 29, 2023

Overview

Getting started

For the purpose of this in-class exercise, the Hunan datasets will be used. There are 2 data sets in this case, they are: - Hunan, a geospatial data set in ESRI shapefile format, and - Hunan_2012, an attribute data set in csv format

Loading packages and data

pacman::p_load(sf, spdep, tmap, tidyverse, knitr,sfdep, plotly, zoo)

Loading data

Geospatial data

The following uses the st_read() function from the sf package to read the geospatial data.

hunan <- st_read(dsn = "data/geospatial", 
                 layer = "Hunan")

Reading layer `Hunan' from data source 
  `C:\Users\Lian Khye\Desktop\MITB\Geospatial\geartooth\ISSS624\In-class_Ex\Ex2\data\geospatial' 
  using driver `ESRI Shapefile'
Simple feature collection with 88 features and 7 fields
Geometry type: POLYGON
Dimension:     XY
Bounding box:  xmin: 108.7831 ymin: 24.6342 xmax: 114.2544 ymax: 30.12812
Geodetic CRS:  WGS 84

Import attributable table

hunan2012 <- read_csv("data/aspatial/Hunan_2012.csv")

Combining the tables

Left join because of joining different data types. Here we retain the hunan dataframe and append the hunan table in order to save the geometry data automatically due to the nature of it being a spatial data.

hunan_GDPPC <- left_join(hunan,hunan2012)%>%
  select(1:4, 7, 15)

Choropleth Map

Here we will be plotting the choropleth map of the hunan_GDPPC joint data from the previous step.

tmap_mode("plot")
tm_shape(hunan_GDPPC) +
  tm_fill("GDPPC", 
          style = "quantile", 
          palette = "Blues",
          title = "GDPPC") +
  tm_borders(alpha = 0.5) +
  tm_layout(main.title = "Distribution of GDP per capita by district, Hunan Province",
            main.title.position = "center",
            main.title.size = 1.2,
            legend.height = 0.45, 
            legend.width = 0.35,
            frame = TRUE) +
  tm_compass(type="8star", size = 2) +
  tm_scale_bar() +
  tm_grid(alpha =0.2)

Spatial Weights

There are 2 types of spatial weights, contiguity and distance based. Contiguity spatial weights refer to having a common border and distance based spatial weights are based on a distance.

Contiguity Spatial Weights

Here we will first try the contiguity weights using st_contiguity() to obtain the number of neighbours, followed by st_weights() to obtain the the contiguity spatial weights.

Queen’s method

Here we will be using the Queen’s method.

nb_queen <- hunan_GDPPC %>% 
  mutate(nb = st_contiguity(geometry),
         .before = 1)
summary(nb_queen$nb)

Neighbour list object:
Number of regions: 88 
Number of nonzero links: 448 
Percentage nonzero weights: 5.785124 
Average number of links: 5.090909 
Link number distribution:

 1  2  3  4  5  6  7  8  9 11 
 2  2 12 16 24 14 11  4  2  1 
2 least connected regions:
30 65 with 1 link
1 most connected region:
85 with 11 links

The report shows that there are 88 area units or regions in the Hunan province and the area with the most number of connected neighbours is 11 and the least is 1.

One advantage of using sfdep is the output being in a table format.

kable(head(nb_queen,
           n=10))

nb	NAME_2	ID_3	NAME_3	ENGTYPE_3	County	GDPPC	geometry
2, 3, 4, 57, 85	Changde	21098	Anxiang	County	Anxiang	23667	POLYGON ((112.0625 29.75523…
1, 57, 58, 78, 85	Changde	21100	Hanshou	County	Hanshou	20981	POLYGON ((112.2288 29.11684…
1, 4, 5, 85	Changde	21101	Jinshi	County City	Jinshi	34592	POLYGON ((111.8927 29.6013,…
1, 3, 5, 6	Changde	21102	Li	County	Li	24473	POLYGON ((111.3731 29.94649…
3, 4, 6, 85	Changde	21103	Linli	County	Linli	25554	POLYGON ((111.6324 29.76288…
4, 5, 69, 75, 85	Changde	21104	Shimen	County	Shimen	27137	POLYGON ((110.8825 30.11675…
67, 71, 74, 84	Changsha	21109	Liuyang	County City	Liuyang	63118	POLYGON ((113.9905 28.5682,…
9, 46, 47, 56, 78, 80, 86	Changsha	21110	Ningxiang	County	Ningxiang	62202	POLYGON ((112.7181 28.38299…
8, 66, 68, 78, 84, 86	Changsha	21111	Wangcheng	County	Wangcheng	70666	POLYGON ((112.7914 28.52688…
16, 17, 19, 20, 22, 70, 72, 73	Chenzhou	21112	Anren	County	Anren	12761	POLYGON ((113.1757 26.82734…

Rook’s method

Here we will be using the Rook’s method.

nb_rook <- hunan_GDPPC %>% 
  mutate(nb = st_contiguity(geometry,
                            queen = FALSE),
         .before = 1)
summary(nb_rook$nb)

Neighbour list object:
Number of regions: 88 
Number of nonzero links: 440 
Percentage nonzero weights: 5.681818 
Average number of links: 5 
Link number distribution:

 1  2  3  4  5  6  7  8  9 10 
 2  2 12 20 21 14 11  3  2  1 
2 least connected regions:
30 65 with 1 link
1 most connected region:
85 with 10 links

Here we see the region with the highest number of neighbours has dropped to 10 due to loss of point or corner neighbours.

Higher order neighbours

Higher order neighbours refer to secondary neighbours that are not directly connected to the region of interest. Depending on the order number, it is how many neighbours away.

We will use **st_nb_lag_cumul()* for calculating.

nb2_queen <-  hunan_GDPPC %>% 
  mutate(nb = st_contiguity(geometry),
         nb2 = st_nb_lag_cumul(nb, 2),
         .before = 1)
summary(nb2_queen)

Neighbour list object:
Number of regions: 88 
Number of nonzero links: 448 
Percentage nonzero weights: 5.785124 
Average number of links: 5.090909 
Link number distribution:

 1  2  3  4  5  6  7  8  9 11 
 2  2 12 16 24 14 11  4  2  1 
2 least connected regions:
30 65 with 1 link
1 most connected region:
85 with 11 links
Neighbour list object:
Number of regions: 88 
Number of nonzero links: 1324 
Percentage nonzero weights: 17.09711 
Average number of links: 15.04545 
Link number distribution:

 5  7  8  9 10 11 12 13 14 15 16 17 18 19 20 21 22 23 24 26 28 33 
 2  1  6  4  5  4  8  5 10  4  4  8  4  8  5  2  2  1  2  1  1  1 
2 least connected regions:
30 88 with 5 links
1 most connected region:
56 with 33 links

    nb         nb2          NAME_2               ID_3          NAME_3         
 NULL:NULL   NULL:NULL   Length:88          Min.   :21098   Length:88         
                         Class :character   1st Qu.:21125   Class :character  
                         Mode  :character   Median :21150   Mode  :character  
                                            Mean   :21150                     
                                            3rd Qu.:21174                     
                                            Max.   :21201                     
  ENGTYPE_3            County              GDPPC                geometry 
 Length:88          Length:88          Min.   : 8497   POLYGON      :88  
 Class :character   Class :character   1st Qu.:14566   epsg:4326    : 0  
 Mode  :character   Mode  :character   Median :20433   +proj=long...: 0  
                                       Mean   :24405                     
                                       3rd Qu.:27224                     
                                       Max.   :88656

Here we see the number of neighbours increasing with the highest number of neighbours reaching 33 for 1 region.

Deriving contiguity weights

After calculating the number of neighbours, we can then compute the contiguity weights using st_weights().

wm_q <- hunan_GDPPC %>% 
  mutate(nb = st_contiguity(geometry),
         wt = st_weights(nb, style = "W"),
         .before = 1)
wm_q

Simple feature collection with 88 features and 8 fields
Geometry type: POLYGON
Dimension:     XY
Bounding box:  xmin: 108.7831 ymin: 24.6342 xmax: 114.2544 ymax: 30.12812
Geodetic CRS:  WGS 84
First 10 features:
                               nb
1                 2, 3, 4, 57, 85
2               1, 57, 58, 78, 85
3                     1, 4, 5, 85
4                      1, 3, 5, 6
5                     3, 4, 6, 85
6                4, 5, 69, 75, 85
7                  67, 71, 74, 84
8       9, 46, 47, 56, 78, 80, 86
9           8, 66, 68, 78, 84, 86
10 16, 17, 19, 20, 22, 70, 72, 73
                                                                            wt
1                                                      0.2, 0.2, 0.2, 0.2, 0.2
2                                                      0.2, 0.2, 0.2, 0.2, 0.2
3                                                       0.25, 0.25, 0.25, 0.25
4                                                       0.25, 0.25, 0.25, 0.25
5                                                       0.25, 0.25, 0.25, 0.25
6                                                      0.2, 0.2, 0.2, 0.2, 0.2
7                                                       0.25, 0.25, 0.25, 0.25
8  0.1428571, 0.1428571, 0.1428571, 0.1428571, 0.1428571, 0.1428571, 0.1428571
9             0.1666667, 0.1666667, 0.1666667, 0.1666667, 0.1666667, 0.1666667
10                      0.125, 0.125, 0.125, 0.125, 0.125, 0.125, 0.125, 0.125
     NAME_2  ID_3    NAME_3   ENGTYPE_3    County GDPPC
1   Changde 21098   Anxiang      County   Anxiang 23667
2   Changde 21100   Hanshou      County   Hanshou 20981
3   Changde 21101    Jinshi County City    Jinshi 34592
4   Changde 21102        Li      County        Li 24473
5   Changde 21103     Linli      County     Linli 25554
6   Changde 21104    Shimen      County    Shimen 27137
7  Changsha 21109   Liuyang County City   Liuyang 63118
8  Changsha 21110 Ningxiang      County Ningxiang 62202
9  Changsha 21111 Wangcheng      County Wangcheng 70666
10 Chenzhou 21112     Anren      County     Anren 12761
                         geometry
1  POLYGON ((112.0625 29.75523...
2  POLYGON ((112.2288 29.11684...
3  POLYGON ((111.8927 29.6013,...
4  POLYGON ((111.3731 29.94649...
5  POLYGON ((111.6324 29.76288...
6  POLYGON ((110.8825 30.11675...
7  POLYGON ((113.9905 28.5682,...
8  POLYGON ((112.7181 28.38299...
9  POLYGON ((112.7914 28.52688...
10 POLYGON ((113.1757 26.82734...

The 3 arguments used are

nb - list of neighbours
style - Default will be W style, repsenting row standardisation. Others include C, global standardisation, U, C style divided by number of neighbours, and S, sums of all links to n.
allow_zero - if TRUE, it assigns 0 to regions without neighbours

Distance based weights

There are 3 difference distance based weights:

Fixed distance weights
Adaptive distance weights
inverse distance weights

Fixed distance weights

As the name suggest, this method uses a fixed distance for measuring the number of neighbours. We will first determine the upper limit using the following steps.

geo <- sf::st_geometry(hunan_GDPPC)
nb <- st_knn(geo, longlat = TRUE)
dists <- unlist(st_nb_dists(geo, nb))

st_nb_dists() is used for calculating the nearest neighbour, the output will be the distance to each neighbour for each region. unlist() is from the base R function and returns the output as a vector.

Next we will derive the statistical summary of the nearest neighbour distances.

summary(dists)

   Min. 1st Qu.  Median    Mean 3rd Qu.    Max. 
  21.56   29.11   36.89   37.34   43.21   65.80

This shows that the max distance for the nearest neigh bour is 65.8km, and when we round up to 66km, each region would have at least 1 neighbour.

With that we can then compute the fixed distance weights.

wm_fd <- hunan_GDPPC %>%
  mutate(nb = st_dist_band(geometry,
                           upper = 66),
               wt = st_weights(nb),
               .before = 1)
summary(wm_fd)

Neighbour list object:
Number of regions: 88 
Number of nonzero links: 400 
Percentage nonzero weights: 5.165289 
Average number of links: 4.545455 
Link number distribution:

 1  2  3  4  5  6  7 
 4  7 10 16 23 23  5 
4 least connected regions:
30 32 65 75 with 1 link
5 most connected regions:
41 52 58 63 80 with 7 links

    nb       wt.Length  wt.Class  wt.Mode    NAME_2               ID_3      
 NULL:NULL   6        -none-   numeric    Length:88          Min.   :21098  
             5        -none-   numeric    Class :character   1st Qu.:21125  
             4        -none-   numeric    Mode  :character   Median :21150  
             4        -none-   numeric                       Mean   :21150  
             5        -none-   numeric                       3rd Qu.:21174  
             3        -none-   numeric                       Max.   :21201  
             3        -none-   numeric                                      
             5        -none-   numeric                                      
             6        -none-   numeric                                      
             6        -none-   numeric                                      
             4        -none-   numeric                                      
             6        -none-   numeric                                      
             4        -none-   numeric                                      
             2        -none-   numeric                                      
             4        -none-   numeric                                      
             5        -none-   numeric                                      
             5        -none-   numeric                                      
             4        -none-   numeric                                      
             5        -none-   numeric                                      
             6        -none-   numeric                                      
             6        -none-   numeric                                      
             4        -none-   numeric                                      
             4        -none-   numeric                                      
             6        -none-   numeric                                      
             6        -none-   numeric                                      
             4        -none-   numeric                                      
             3        -none-   numeric                                      
             6        -none-   numeric                                      
             2        -none-   numeric                                      
             1        -none-   numeric                                      
             4        -none-   numeric                                      
             1        -none-   numeric                                      
             5        -none-   numeric                                      
             5        -none-   numeric                                      
             5        -none-   numeric                                      
             5        -none-   numeric                                      
             2        -none-   numeric                                      
             3        -none-   numeric                                      
             5        -none-   numeric                                      
             6        -none-   numeric                                      
             7        -none-   numeric                                      
             6        -none-   numeric                                      
             5        -none-   numeric                                      
             3        -none-   numeric                                      
             6        -none-   numeric                                      
             6        -none-   numeric                                      
             5        -none-   numeric                                      
             5        -none-   numeric                                      
             5        -none-   numeric                                      
             6        -none-   numeric                                      
             5        -none-   numeric                                      
             7        -none-   numeric                                      
             2        -none-   numeric                                      
             6        -none-   numeric                                      
             4        -none-   numeric                                      
             2        -none-   numeric                                      
             6        -none-   numeric                                      
             7        -none-   numeric                                      
             3        -none-   numeric                                      
             6        -none-   numeric                                      
             5        -none-   numeric                                      
             4        -none-   numeric                                      
             7        -none-   numeric                                      
             4        -none-   numeric                                      
             1        -none-   numeric                                      
             6        -none-   numeric                                      
             2        -none-   numeric                                      
             5        -none-   numeric                                      
             3        -none-   numeric                                      
             3        -none-   numeric                                      
             4        -none-   numeric                                      
             5        -none-   numeric                                      
             6        -none-   numeric                                      
             5        -none-   numeric                                      
             1        -none-   numeric                                      
             6        -none-   numeric                                      
             5        -none-   numeric                                      
             3        -none-   numeric                                      
             6        -none-   numeric                                      
             7        -none-   numeric                                      
             5        -none-   numeric                                      
             6        -none-   numeric                                      
             5        -none-   numeric                                      
             4        -none-   numeric                                      
             3        -none-   numeric                                      
             6        -none-   numeric                                      
             4        -none-   numeric                                      
             2        -none-   numeric                                      
    NAME_3           ENGTYPE_3            County              GDPPC      
 Length:88          Length:88          Length:88          Min.   : 8497  
 Class :character   Class :character   Class :character   1st Qu.:14566  
 Mode  :character   Mode  :character   Mode  :character   Median :20433  
                                                          Mean   :24405  
                                                          3rd Qu.:27224  
                                                          Max.   :88656  
                                                                         
                                                                         
                                                                         
                                                                         
                                                                         
                                                                         
                                                                         
                                                                         
                                                                         
                                                                         
                                                                         
                                                                         
                                                                         
                                                                         
                                                                         
                                                                         
                                                                         
                                                                         
                                                                         
                                                                         
                                                                         
                                                                         
                                                                         
                                                                         
                                                                         
                                                                         
                                                                         
                                                                         
                                                                         
                                                                         
                                                                         
                                                                         
                                                                         
                                                                         
                                                                         
                                                                         
                                                                         
                                                                         
                                                                         
                                                                         
                                                                         
                                                                         
                                                                         
                                                                         
                                                                         
                                                                         
                                                                         
                                                                         
                                                                         
                                                                         
                                                                         
                                                                         
                                                                         
                                                                         
                                                                         
                                                                         
                                                                         
                                                                         
                                                                         
                                                                         
                                                                         
                                                                         
                                                                         
                                                                         
                                                                         
                                                                         
                                                                         
                                                                         
                                                                         
                                                                         
                                                                         
                                                                         
                                                                         
                                                                         
                                                                         
                                                                         
                                                                         
                                                                         
                                                                         
                                                                         
                                                                         
                                                                         
          geometry 
 POLYGON      :88  
 epsg:4326    : 0  
 +proj=long...: 0

Here we can see that there are 4 regions with only 1 neighbour and 5 regions with 7 neighbours.

st_dists_band() can be used to identify neighbours and the fixed distance selected. st_weights() uses the W style and the allow_zero to be TRUE as default.

Adaptive distance weights

Here we will use adaptive distance weights where the distance used is dependent on the data density.

wm_ad <- hunan_GDPPC %>% 
  mutate(nb = st_knn(geometry,
                     k=8),
         wt = st_weights(nb),
               .before = 1)

st_knn() is used to determine the number of neighbours to be used for calculating distance. st_weights() is used for calculaing the polygon spatial weights of the list of neightbours. It uses the W style and the allow_zero to be TRUE as default.

Inverse distance weights

This method will create proportional weights.

wm_idw <- hunan_GDPPC %>%
  mutate(nb = st_contiguity(geometry),
         wts = st_inverse_distance(nb, geometry,
                                   scale = 1,
                                   alpha = 1),
         .before = 1)

st_contiguity() is used to identify the neighbours using a common border criteria. st_inverse_distance() is then used for calculating the inverse distance weights of the neighbours in the list.