Introduction

Anyone who knows even a little about American politics knows that it is very polarized. However, it was not always like this; polarization rose sharply in the early 21st century.

In this analysis, we look at how polarized Congress has become by comparing the 90th Congress with the 116th Congress. We quantify this polarization by applying singular value decomposition (SVD) to the roll-call votes of each member of Congress. The SVD factors the vote matrix into three matrices that relate the voting patterns of representatives and senators to a set of underlying concepts: one matrix relates each member to each concept, one gives the strength of each concept, and one relates each roll call to each concept. These concepts are not defined in advance; they are meant to be uncovered by studying the data.
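As a minimal sketch of the decomposition described above, the snippet below runs base R's svd() on a small made-up vote matrix. The matrix `votes`, its member and roll-call names, and the 1 / -1 / 0 coding (yea / nay / missing) are assumptions for illustration only, not the report's data.

```r
# Toy vote matrix (hypothetical): rows are members, columns are roll calls.
# Assumed coding: 1 = yea, -1 = nay, 0 = missing.
votes <- matrix(
  c( 1,  1, -1, -1,
     1,  1, -1,  0,
    -1, -1,  1,  1,
    -1,  0,  1,  1,
     1, -1,  1, -1),
  nrow = 5, byrow = TRUE,
  dimnames = list(paste0("member_", 1:5), paste0("rc_", 1:4))
)

decomp <- svd(votes)

# u: one row per member, one column per concept
dim(decomp$u)
# d: strength of each concept, sorted from largest to smallest
decomp$d
# v: one row per roll call, one column per concept
dim(decomp$v)

# Multiplying the three pieces back together recovers the original matrix
rebuilt <- decomp$u %*% diag(decomp$d) %*% t(decomp$v)
all.equal(rebuilt, votes, check.attributes = FALSE)
```

Keeping only the first few columns of u and v, instead of all of them, gives the low-dimensional views used in the rest of this report.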

I split this analysis into two parts: the first looks at the partisanship of the members, and the second looks at the partisanship of the roll calls.

Note: For ease of analysis, all NA values are converted to 0. This way, they do not affect whether a vote passed.



Partisanship by Members

To begin this analysis, I plotted the first two columns of the left-most matrix, u (Figure 1). The first vector represents the partisanship of each member, and the second represents the bipartisanship of each member. Partisanship values are on the x-axis and bipartisanship values are on the y-axis.

There is an obvious trend: in the 116th Congress, Democrats and Republicans each group into their own quadrants. In contrast, in the 90th Congress, Democrats and Republicans sat around the same partisan value while having differing bipartisan values. But what does this mean? In the 116th Congress, members of both parties vote almost exclusively with their own party, meaning that there is little middle ground.

Since we are using SVD, it is important to state how much energy is lost. The energy lost in an SVD analysis is the share of the information in the original data that the retained components do not capture. Because we use only the first two singular vectors out of a hundred or more, we cannot retain 100% of the information. Averaged across the two chambers of the 90th and 116th Congresses, we lost 32.6151487% of the energy. We therefore keep the majority of the information, but lose about a third of it by using only the first two singular vectors.
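To make "energy lost" concrete: the energy retained by the first k components is the sum of the first k squared singular values divided by the sum of all squared singular values, which is the same cumulative-energy calculation the appendix code uses. The sketch below applies it to a made-up vector of singular values; the helper `energy_lost()` and the values in `d` are hypothetical and only illustrate the arithmetic.

```r
# Percentage of energy lost when keeping only the first k singular vectors.
# d is a vector of singular values, as returned in svd(x)$d.
energy_lost <- function(d, k) {
  energy <- d^2                                   # energy of each component
  retained <- sum(energy[seq_len(k)]) / sum(energy)
  (1 - retained) * 100
}

d <- c(3, 2, 1)      # hypothetical singular values
energy_lost(d, 2)    # keeps (9 + 4) / 14 of the energy
energy_lost(d, 1)    # keeps 9 / 14 of the energy
```

With real data, `d` would be the `d` component of the SVD of a chamber's vote matrix, and the result would be averaged over the four chambers.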



Range of Partisan Values by Region

To continue the analysis of the partisanship of members, I plotted the average partisan value of each party in each region for each Congress. These computations use the first column of the u matrix, whose rows correspond to the members of Congress (the same values as the x-axis of the previous plot).
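The averaging step can be sketched in base R as follows. The data frame `members`, its column names, and its values are made up for illustration; `partisan` stands in for the first column of u, and the appendix performs the equivalent step with dplyr.

```r
# Hypothetical member-level data: `partisan` stands in for the first column of u.
members <- data.frame(
  region   = c("South", "South", "South", "West", "West", "West"),
  party    = c("D", "D", "R", "D", "R", "R"),
  partisan = c(-0.04, -0.02, 0.05, -0.03, 0.02, 0.04)
)

# Mean partisan value for every region/party combination
# (rows are regions, columns are parties)
means <- tapply(members$partisan, list(members$region, members$party), mean)

# Gap between the party averages in each region: the length of each dumbbell
gap <- means[, "R"] - means[, "D"]
means
gap
```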

Note: Independents are not included, since they are only present in the 116th House.

In the 90th House (Figure 2.1), the regions differed in their partisanship ranges, though most are similar in that Democrats tend to have a lower partisanship value than Republicans. The Northeast deviates from this pattern: there, Republicans have a smaller partisan value than Democrats.

In the 116th House (Figure 2.2), we can observe a bigger gap between Democrats and Republicans in the partisanship values. Compared to the 90th House, all regions follow the same pattern, in which Democrats and Republicans each keep to their own side (as seen in the scatterplot). All regions have a similar range except for the Northeast.

As in the 90th House, the ranges in the 90th Senate (Figure 2.3) varied from region to region, with Democrats and Republicans falling on either side.

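In the 116th Senate (Figure 2.4), much as in the 116th House, the partisanship ranges became similar across regions, with Democrats and Republicans each flocking to one side. However, Republicans now have a lower partisanship value than Democrats, the opposite of what we saw in the 116th House. Because the sign of a singular vector is arbitrary, this flip likely reflects the orientation of the axis rather than a change in behavior.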

All in all, these plots show that polarization was in effect no matter what region a member of Congress came from, with Democrats and Republicans flocking to their own sides and holding their ground, as we saw in Figure 1.

Again, it is important to state the energy that is lost in this one-dimensional analysis. Since we use only one singular vector, we lose quite a bit of energy: on average, 55.2805373% across the House and Senate of both Congresses.



Partisanship by Roll Calls

In this part, I show how roll calls were voted down differently in the 90th and 116th Congresses.

Note: A vote only passes when the ‘Yea’ votes exceed a certain threshold: 50 for the Senate and 217 for the House.
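The rule in the note can be sketched as below. This is an illustration of the rule as stated, not the appendix implementation; the matrix `votes`, the tiny `threshold`, and the 1 / -1 / 0 coding (yea / nay / missing) are assumptions chosen so the example fits in a few lines.

```r
# Hypothetical chamber: 5 members voting on 3 roll calls.
# Assumed coding: 1 = yea, -1 = nay, 0 = missing (NA converted to 0).
votes <- matrix(
  c( 1,  1, -1,
     1, -1, -1,
     1,  0, -1,
    -1,  1,  1,
     1, -1,  0),
  nrow = 5, byrow = TRUE,
  dimnames = list(NULL, c("rc_1", "rc_2", "rc_3"))
)

threshold <- 2   # hypothetical; the report uses 50 (Senate) and 217 (House)

yea_counts <- colSums(votes == 1)   # number of yea votes per roll call
passed <- yea_counts > threshold    # TRUE where the yea count exceeds the threshold
passed
```

Because missing votes are coded 0, they never count as a yea and so cannot push a roll call over the threshold.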

Figure 3 visualizes the first two columns of the v-matrix, which represent the partisanship and bipartisanship of each roll call. The visual is similar to the one shown at the beginning, but each point now represents a roll call instead of a member of Congress.

I have also labeled each roll call as passed or as voted down by the majority of Democrats, Republicans, or both. I classified who voted a roll call down by checking whether the party had more “nay” votes than “yea” votes. If both parties had more “nay” votes than “yea” votes, both parties voted it down.
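The classification can be sketched on made-up numbers as follows. The data frame `rc` and the helper `classify_vote_down()` are hypothetical; the nested ifelse() mirrors the logic of get_rc_info() in the appendix, where each party's value is its net vote (yea minus nay), so a negative value means more "nay" than "yea".

```r
# Hypothetical net votes (yea minus nay) per party for four roll calls.
rc <- data.frame(
  passed = c(TRUE, FALSE, FALSE, FALSE),
  dems   = c( 40,  -30,   25,  -10),
  reps   = c( 35,   20,  -45,   -5)
)

# A roll call that failed is attributed to both parties when both net votes
# are negative; otherwise to the party with the lower net vote.
classify_vote_down <- function(passed, dems, reps) {
  ifelse(passed, "Passed",
         ifelse(dems < 0 & reps < 0, "Both",
                ifelse(reps < dems, "R", "D")))
}

rc$vote_down <- classify_vote_down(rc$passed, rc$dems, rc$reps)
rc
```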

Note: We ignore Independents here, since they do not have much leverage as a group; they tend to have only a few members in Congress.

Though all of these plots make a similar diamond shape, the points in the center of the diamond disappear in the 116th Congress. Overall, we see a similar pattern: the parties are once again split in the 116th Congress, with Democrats voting down bills on one side and Republicans voting down bills on the other. We also see fewer bills voted down by both Democrats and Republicans together in the 116th Congress than in the 90th Congress.

We lose the same amount of energy as we did in Figure 1, since we again use only the first two singular vectors.



Who Voted Down Most Frequently?

We saw in Figure 3 that the voting-down behavior of Democrats and Republicans differed between the 90th and 116th Congresses: in the 116th, fewer bills were voted down by both parties together. Let’s tabulate this data to see how many bills were voted down by each party.

Figure 4

            90th Congress       116th Congress
            House    Senate     House    Senate
D           115      148        148      225
R           131      135        349      39
Both        20       149        6        8
Passed      212      164        230      188
Total       478      596        733      460

In the 90th Congress, it was common for both parties to vote down a bill together, especially in the Senate (149 of 596 roll calls). In both the Senate and the House, Democrats and Republicans voted down roughly equal numbers of bills. In the 116th Congress, however, the pattern is very one-sided, consistent with the previous findings. Though some bills did pass in the 116th Congress, Republicans in the House voted down more than twice as many bills as Democrats (349 vs. 148), and Democrats in the Senate voted down almost 6 times as many bills as Republicans (225 vs. 39). Very few bills were voted down by both parties.
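The party-to-party ratios for the 116th Congress can be recomputed directly from the counts in Figure 4. The snippet below is a small check; the data frame `voted_down` and its column names are made up here, while the counts are copied from the table.

```r
# Counts of roll calls voted down, copied from Figure 4
voted_down <- data.frame(
  party = c("D", "R", "Both"),
  h90   = c(115, 131, 20),
  s90   = c(148, 135, 149),
  h116  = c(148, 349, 6),
  s116  = c(225, 39, 8)
)

d <- voted_down[voted_down$party == "D", ]
r <- voted_down[voted_down$party == "R", ]

# How many times more roll calls one party voted down than the other
house_116_r_to_d  <- r$h116 / d$h116   # 349 / 148
senate_116_d_to_r <- d$s116 / r$s116   # 225 / 39
round(c(house_116_r_to_d, senate_116_d_to_r), 2)
```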



More than just Bipartisanship and Partisanship?

Figure 5 is a 3-D plot of the first three singular vectors of the u-matrix for the 116th House; each point still represents a member of Congress. It is quite possible that another concept is in play here, since energy remains in the decomposition beyond the first two vectors.

Just as we could only speculate that the first two vectors represent partisanship and bipartisanship, we can only speculate about what the third represents. It could be religious beliefs, economic beliefs, or some other concept we have no knowledge of.

In conclusion, we saw several patterns across these figures. Throughout, the partisanship of Democrats and Republicans was very clear in the 116th Congress. We observed large differences in how Democrats and Republicans behaved in the 90th Congress compared with the 116th Congress, especially in how each party flocked to its own side. All in all, we were able to use SVD effectively to gauge the relationships that exist among hundreds of variables.

Appendix: All code for this report

# Load libraries
library(tidyverse)
library(gridExtra)
library(grid)
library(ggalt)
library(tigris)
library(patchwork)
library(cowplot)
library(plotly)
library(kableExtra)
library(formattable)

# Set code chunk options for all chunks
knitr::opts_chunk$set(message = FALSE, echo = FALSE, warning = FALSE)


# Read in data ----
house_90 <- read.csv("https://raw.githubusercontent.com/bryandmartin/STAT302/master/docs/Projects/project2_svd/house_90_raw.csv")

senate_90 <- read.csv("https://raw.githubusercontent.com/bryandmartin/STAT302/master/docs/Projects/project2_svd/senate_90_raw.csv")

house_116 <- read.csv("https://raw.githubusercontent.com/bryandmartin/STAT302/master/docs/Projects/project2_svd/house_116_raw.csv")

senate_116 <- read.csv("https://raw.githubusercontent.com/bryandmartin/STAT302/master/docs/Projects/project2_svd/senate_116_raw.csv")

region_usa <- read.csv("https://raw.githubusercontent.com/cphalpert/census-regions/master/us%20census%20bureau%20regions%20and%20divisions.csv") %>% 
  rename(c("state" = "State", "short" = "State.Code", 
           "region" = "Region", "division" = "Division"))

region_usa$state <- str_to_lower(region_usa$state)

# Set NAs to 0 ----
house_116[is.na(house_116)] <- 0

house_90[is.na(house_90)] <- 0

senate_116[is.na(senate_116)] <- 0

senate_90[is.na(senate_90)] <- 0

# Function to get info about the RC
# Input: congress type either senate or house
# Output: a dataframe containing: if the RC passed, how many votes from democrats/republicans, who voted it down
#
get_rc_info <- function(congress) {
  
  if (floor(nrow(congress) / 2) <= 100) {
    threshold <- 50
  } else {
    threshold <- 217
  }
  
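  # Column sums of the vote codes; assuming yea = 1, nay = -1, and NA = 0,
  # this is the net margin (yea minus nay) for each roll call
  df <- data.frame("passed" = colSums(congress[-c(1:4)]) > threshold)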
  
  dems <- colSums(subset(congress, party_code == "D")[-c(1:4)])
  reps <- colSums(subset(congress, party_code == "R")[-c(1:4)])
  
  if ("I" %in% congress$party_code ){
    ind <- colSums(subset(congress, party_code == "I")[-c(1:4)])
    df <- cbind(df, dems, reps, ind)
  } else {
    df <- cbind(df, dems, reps)
  }
  
  
  df <- df %>% 
    mutate(vote_down = ifelse(passed == TRUE, "Passed", 
                              ifelse(dems < 0 & reps < 0, "Both", 
                                     ifelse(reps < dems, "R", "D"))))
  
  df
}

# Function to get svd of inputted congress
# Input: congress type either senate or house, indication if you want u or v matrix
# Output: a dataframe containing: partisan axis, bipartisan axis, and label. Party and state also included if each row is congress member
#
get_svd_info <- function(congress, rc) {
  svd_congress <- svd(congress[-c(1:4)])
  
  if (rc) {
    svd_df <- data.frame("x" = svd_congress$v[, 1], 
                         "y" = svd_congress$v[, 2], 
                         "label" = colnames(congress[-c(1:4)]))
  } else {
    svd_df <- data.frame("x" = svd_congress$u[, 1], 
                         "y" = svd_congress$u[, 2], 
                         "label" = congress$bioname, 
                         "party" = congress$party_code, 
                         "state" = congress$state_abbrev,
                         "born" = congress$born)
    
    qt <- quantile(congress$born)
    
    svd_df$born_qt <- ifelse(svd_df$born < qt[[2]], "Q1", 
                             ifelse(svd_df$born < qt[[3]], 
                                    "Q2",
                                    ifelse(svd_df$born < qt[[4]], 
                                           "Q3", "Q4")))
  }
  
  svd_df
}

# Function to get a table of which party voted down the roll calls the most
# Input: congress type either senate or house
# Output: a table which consists of each category: D, R, Both, or Passed. Another column for the counts corresponding to the category.
#
rc_votedown_table <- function(congress) {
  congress_vd <- get_rc_info(congress)
  
  congress_vd$vote_down <- factor(congress_vd$vote_down, 
                                  levels = c("D", "R", "Both", "Passed"))
  
  congress_vd %>% 
    group_by(vote_down) %>% 
    summarize(n = n()) %>% 
    rename(c("Party" = "vote_down", "Count" = "n")) %>% 
    as.data.frame()
}

# Function to plot partisan and bipartisan of each roll call
# Input: congress type either senate or house
# Output: a plot of each roll call on partisan and bipartisan axis, each will be indicated if the roll call passed, or voted down by democrats, republicans or both
#
plot_rc <- function(congress) {
  if (congress == "90th House") {
    congress_df <- house_90
  } else if (congress == "116th House") {
    congress_df <- house_116
  } else if (congress == "90th Senate") {
    congress_df <- senate_90
  } else if (congress == "116th Senate") {
    congress_df <- senate_116
  } else {
    stop("Entered wrong variable.")
  }
  
  congress_rc <- cbind(get_svd_info(congress_df, TRUE), get_rc_info(congress_df))
  
  congress_rc$vote_down <- factor(congress_rc$vote_down, 
                                  levels = c("D", "R", "Both", "Passed"))
  
  plot <- congress_rc %>% 
    ggplot() +
    geom_point(
      aes(x = x, y = y, color = vote_down, shape = vote_down == "Passed")) +
    xlim(c(-0.1, 0.1)) +
    ylim(c(-0.1, 0.1)) +
    scale_shape_manual(values = c(4, 1)) +
    scale_color_manual(
      values = c("Both" = "orange",
                 "D" = "dodgerblue4", 
                 "R" = "red3", 
                 "Passed" = "seagreen4")) +
    guides(color = guide_legend(title = "Voted Down By"), shape = "none") +
    theme_minimal() +
    theme(plot.title = element_text(face = "italic"),
          plot.caption = element_text(hjust = 1, color = "#696969"),
          text = element_text(family = "mono"),
          plot.background = element_rect(fill = "#F9F9F9", color = "#F9F9F9"),
          panel.background = element_rect(fill = "#F9F9F9", color = "#F9F9F9")) +
    labs(x = "Partisan", y = "Bipartisan", title = congress)
  
  plot
}

# Function to plot partisan and bipartisan values of each member
# Input: congress type either senate or house
# Output: a plot of each member
#
plot_member <- function(congress) {
  if (congress == "90th House") {
    congress_df <- house_90
    x_lims <- c(-0.07, 0.07)
    y_lims <- c(-0.08, 0.08)
  } else if (congress == "116th House") {
    congress_df <- house_116
    x_lims <- c(-0.07, 0.07)
    y_lims <- c(-0.08, 0.08)
  } else if (congress == "90th Senate") {
    congress_df <- senate_90
    x_lims <- c(-0.15, 0.15)
    y_lims <- c(-0.2, 0.2)
  } else if (congress == "116th Senate") {
    congress_df <- senate_116
    x_lims <- c(-0.15, 0.15)
    y_lims <- c(-0.2, 0.2)
  } else {
    stop("Entered wrong variable.")
  }
  
  congress_mem <- get_svd_info(congress_df, FALSE)
  
  congress_mem$party <- factor(congress_mem$party, levels = c("D", "R", "I"))
  
  plot <- congress_mem %>% 
    ggplot() +
    geom_point(aes(x = x, y = y, color = party), size = 0.6) +
    scale_color_manual(
      values = c("D" = "dodgerblue4", "R" = "red3", "I" = "forestgreen")) +
    geom_hline(yintercept = 0, linetype = "dashed", color = "lightgrey") +
    geom_vline(xintercept = 0, linetype = "dashed", color = "lightgrey") +
    xlim(x_lims) +
    ylim(y_lims) +
    theme_minimal() +
    theme(plot.title = element_text(face = "italic"),
          plot.caption = element_text(hjust = 1, color = "#696969"),
          text = element_text(family = "mono"),
          plot.background = element_rect(fill = "#F9F9F9", color = "#F9F9F9"),
          panel.background = element_rect(fill = "#F9F9F9", color = "#F9F9F9")) +
    labs(title = congress, x = "Partisan", y = "Bipartisan")
  
  # Return the plot explicitly, as plot_rc() does
  plot
}

# Function to plot average partisan value by region
# Input: congress type either senate or house
# Output: a dumbbell plot of each region
#
plot_partisan_region <- function(congress) {
  
  if (congress == "90th House") {
    congress_df <- house_90
  } else if (congress == "116th House") {
    congress_df <- house_116
  } else if (congress == "90th Senate") {
    congress_df <- senate_90
  } else if (congress == "116th Senate") {
    congress_df <- senate_116
  } else {
    stop("Entered wrong variable.")
  }
  
  congress_df <- left_join(get_svd_info(congress_df, FALSE), region_usa, 
                           by = c("state" = "short"))
  
  congress_df <- congress_df %>% 
    group_by(region, party) %>% 
    summarize(mean = mean(x)) %>% 
    filter(party != "I") %>% 
    spread(party, mean)
  
  congress_df %>% 
    ggplot() +
    geom_dumbbell(aes(x = D, xend = R, y = region), 
                  size = 1.5, color="#b2b2b2",
                  size_x = 3, size_xend = 3,
                  colour_x = "dodgerblue4", colour_xend = "red3") +
    geom_text(aes(x = D, y = region, label = "D"), 
              color = "dodgerblue4", size = 3, vjust = -1.5, fontface = "bold",
              family = "mono") +
    geom_text(aes(x = R, y = region, label = "R"), 
              color = "red3", size = 3, vjust = -1.5, fontface = "bold",
              family = "mono") +
    theme_minimal() +
    theme(plot.title = element_text(face = "italic"),
          plot.caption = element_text(hjust = 1, color = "#696969"),
          text = element_text(family = "mono"),
          panel.grid.minor = element_blank(),
          panel.grid.major.x = element_blank(),
          plot.background = element_rect(fill = "#F9F9F9", color = "#F9F9F9"),
          panel.background = element_rect(fill = "#F9F9F9", color = "#F9F9F9")) +
    labs(x = "Partisan", y = "Region", title = congress)
}

# Function to get how much energy was lost in the svd
# Input: congress type either senate or house, indication whether you want a plot or not
# Output: a dataframe or plot containing singular values and corresponding energy
#
svd_energy_lost <- function(congress, plot) {
  svd_congress <- svd(congress[-c(1:4)])
  
  sing_vals <- svd_congress$d^2
  
  energy_df <- data.frame("sing_vals" = sing_vals,
                          "energy" = cumsum(sing_vals) / sum(sing_vals))
  
  if (plot) {
    p1 <- ggplot(energy_df, aes(x = 1:nrow(energy_df), y = sing_vals)) +
      geom_point(size = 0.2) +
      labs(y = "Singular Values", x = "k") +
      theme_minimal() +
      theme(plot.title = element_text(face = "italic"),
            plot.caption = element_text(hjust = 0, color = "#696969"),
            text = element_text(family = "mono"),
            legend.position = "none",
            plot.background = element_rect(fill = "#F9F9F9", color = "white"))
    p2 <- ggplot(energy_df, aes(x = 1:nrow(energy_df), y = energy)) +
      geom_point(size = 0.2) +
      labs(y = "Cumulative Energy", x = "k") +
      theme_minimal() +
      theme(plot.title = element_text(face = "italic"),
            plot.caption = element_text(hjust = 1, color = "#696969"),
            text = element_text(family = "mono"),
            legend.position = "none",
            plot.background = element_rect(fill = "#F9F9F9", color = "white"))
    
    return(grid.arrange(p1, p2, ncol = 2))
  } else {
    return(energy_df)
  }
}

# Energy lost (2-D): average share of energy not captured by the first two singular vectors
avg_energy_lost <- (1 - (svd_energy_lost(house_90, FALSE)[2, ]$energy + 
                           svd_energy_lost(house_116, FALSE)[2, ]$energy + 
                           svd_energy_lost(senate_90, FALSE)[2, ]$energy +
                           svd_energy_lost(senate_116, FALSE)[2, ]$energy) / 4) * 100

# Figure 1 ----
house_90_plot <- plot_member("90th House")

house_116_plot <- plot_member("116th House")

senate_90_plot <- plot_member("90th Senate")

senate_116_plot <- plot_member("116th Senate")

combined <- house_90_plot + house_116_plot + senate_90_plot + senate_116_plot & theme(legend.position = "bottom")
combined + plot_layout(guides = "collect") + labs(caption = "Figure 1")

# Figure 2.1 ----
plot_partisan_region("90th House") + 
  labs(caption = "Figure 2.1")

# Figure 2.2 ----
plot_partisan_region("116th House") + 
  labs(caption = "Figure 2.2")

# Figure 2.3 ----
plot_partisan_region("90th Senate") + 
  labs(caption = "Figure 2.3")

# Figure 2.4 ----
plot_partisan_region("116th Senate") + 
  labs(caption = "Figure 2.4")

# Energy lost (1-D): average share of energy not captured by the first singular vector
avg_energy_lost_1d <- (1 - (svd_energy_lost(house_90, FALSE)[1, ]$energy + 
                              svd_energy_lost(house_116, FALSE)[1, ]$energy + 
                              svd_energy_lost(senate_90, FALSE)[1, ]$energy + 
                              svd_energy_lost(senate_116, FALSE)[1, ]$energy) / 4) * 100

# Figure 3 ----
p1 <- plot_rc("90th House") 
p2 <- plot_rc("116th House")
p3 <- plot_rc("90th Senate")
p4 <- plot_rc("116th Senate")

combined <- p1 + p2 + p3 + p4 & theme(legend.position = "bottom", plot.caption = element_text(hjust = 1, vjust = 1,face = "italic"))
combined + plot_layout(guides = "collect", widths = 1)  + 
  labs(caption = "Figure 3")

# Figure 4 ----
vd_h90 <- rc_votedown_table(house_90)
vd_h116 <- rc_votedown_table(house_116)
vd_s90 <- rc_votedown_table(senate_90)
vd_s116 <- rc_votedown_table(senate_116)

all_votedown <- cbind(vd_h90, vd_s90[, 2], vd_h116[, 2], vd_s116[, 2])

sum_rc <- c("Total", colSums(all_votedown[-1]))

colnames(all_votedown) <- c("party", "h90", "s90", "h116", "s116")

passed <- all_votedown %>% 
  tail(1)

all_votedown %>% 
  head(-1) %>% 
  mutate(
    party = c("D", "R", "Both"),
    h90 = color_tile("#DeF7E9", "#71CA97")(h90),
    s90 = color_tile("#DeF7E9", "#71CA97")(s90),
    h116 = color_tile("#DeF7E9", "#71CA97")(h116),
    s116 = color_tile("#DeF7E9", "#71CA97")(s116)
  ) %>% 
  rbind(passed, sum_rc) %>% 
  select(party, everything()) %>% 
  kable(col.names = c("" ,"House", "Senate", "House", "Senate"), 
        align = "ccccc", "html", 
        escape = F,
        caption = "<i>Figure 4</i>") %>%
  row_spec(1:5, background = "#F9F9F9") %>%
  kable_paper() %>% 
  kable_styling("hover", full_width = F) %>%
  add_header_above(c(" ", "90th Congress" = 2, "116th Congress" = 2)) %>% 
  kable_styling(bootstrap_options = "hover", 
                html_font = "Andale Mono", 
                full_width = TRUE)

# Figure 5 ----
svd_congress <- svd(house_116[-c(1:4)])

svd_df <- data.frame("x" = svd_congress$u[, 1], 
                     "y" = svd_congress$u[, 2], 
                     "z" = svd_congress$u[, 3],
                     "label" = house_116$bioname, 
                     "party" = house_116$party_code, 
                     "state" = house_116$state_abbrev,
                     "born" = house_116$born)

plot_ly(
  data = svd_df, 
  x = ~x, y = ~y, z = ~z, 
  type = "scatter3d", 
  mode = "markers", 
  colors = c("D" = "dodgerblue4", "R" = "red3", "I" = "forestgreen"), 
  color = ~party, 
  size = 0.5, 
  textfont = list("Andale Mono")) %>% 
  layout(annotations = 
           list(x = 1, y = -0.1, text = sprintf("<i>%s</i>", "Figure 5"), 
                showarrow = F, xref = 'paper', yref = 'paper', 
                xanchor = 'right', yanchor = 'auto', 
                xshift = 0, yshift = 0, 
                font = list(size = 12, family = "Andale Mono")))