2023-09-13 PMMoV qPCR Analysis

Author

Dan Rice

Published

September 25, 2023

Objectives

NOTE: Need sample metadata. I don’t know what the samples are.

  • Test qPCR of PMMoV
  • Test duplexed qPCR
  • See Google Doc

Preliminary work

  • Dan copied the .eds file from the NAO qPCR data directory to the experiment directory and exported .csv files.

Data import

library(readr)
library(dplyr)

Attaching package: 'dplyr'
The following objects are masked from 'package:stats':

    filter, lag
The following objects are masked from 'package:base':

    intersect, setdiff, setequal, union
library(purrr)
library(stringr)
library(ggplot2)
library(tidyr)
library(broom)
data_dir <-
  "~/airport/[2023-09-13] PMMoV and Multiplex qPCR/"
filename_pattern <- "Results"
col_types <- list(
  Target = col_character(),
  Cq = col_double()
)
raw_data <- list.files(
  paste0(data_dir, "qpcr"),
  pattern = filename_pattern,
  full.names = TRUE,
) |>
  print() |>
  map(function(f) {
    read_csv(f,
      skip = 23,
      col_types = col_types,
    )
  }) |>
  list_rbind()
[1] "/Users/dan/airport/[2023-09-13] PMMoV and Multiplex qPCR/qpcr/Ari PMMoV_Results_20230925_115306.csv"
Warning: One or more parsing issues, call `problems()` on your data frame for details,
e.g.:
  dat <- vroom(...)
  problems(dat)
print(raw_data)
# A tibble: 63 × 21
    Well `Well Position` Omit  Sample Target Task    Reporter Quencher
   <dbl> <chr>           <lgl> <chr>  <chr>  <chr>   <chr>    <chr>   
 1     1 A1              FALSE 1      PPMoV  UNKNOWN VIC      NFQ-MGB 
 2     2 A2              FALSE 1      PPMoV  UNKNOWN VIC      NFQ-MGB 
 3     3 A3              FALSE 1      PPMoV  UNKNOWN VIC      NFQ-MGB 
 4     5 A5              FALSE 1      N2     UNKNOWN FAM      NFQ-MGB 
 5     6 A6              FALSE 1      N2     UNKNOWN FAM      NFQ-MGB 
 6     8 A8              FALSE NTC    N2     UNKNOWN FAM      NFQ-MGB 
 7    10 A10             FALSE 1      N2     UNKNOWN FAM      NFQ-MGB 
 8    10 A10             FALSE 1      PPMoV  UNKNOWN VIC      NFQ-MGB 
 9    11 A11             FALSE 1      N2     UNKNOWN FAM      NFQ-MGB 
10    11 A11             FALSE 1      PPMoV  UNKNOWN VIC      NFQ-MGB 
# ℹ 53 more rows
# ℹ 13 more variables: `Amp Status` <chr>, `Amp Score` <dbl>,
#   `Curve Quality` <lgl>, `Result Quality Issues` <lgl>, Cq <dbl>,
#   `Cq Confidence` <dbl>, `Cq Mean` <dbl>, `Cq SD` <dbl>,
#   `Auto Threshold` <lgl>, Threshold <dbl>, `Auto Baseline` <lgl>,
#   `Baseline Start` <dbl>, `Baseline End` <dbl>

It looks like the software splits duplexed targets into two rows, but doesn’t save the information about multiplexing.

raw_data |> count(Target, Reporter, Sample, Quencher)
# A tibble: 19 × 5
   Target Reporter Sample Quencher     n
   <chr>  <chr>    <chr>  <chr>    <int>
 1 N2     FAM      1      NFQ-MGB      5
 2 N2     FAM      2      NFQ-MGB      5
 3 N2     FAM      NTC    NFQ-MGB      2
 4 N2     FAM      SC1    NFQ-MGB      3
 5 N2     FAM      SC2    NFQ-MGB      3
 6 N2     FAM      SC3    NFQ-MGB      3
 7 N2     FAM      SC4    NFQ-MGB      3
 8 N2     FAM      SC5    NFQ-MGB      3
 9 N2     FAM      SC6    NFQ-MGB      3
10 N2     FAM      <NA>   NFQ-MGB      2
11 PPMoV  VIC      1      NFQ-MGB      6
12 PPMoV  VIC      2      NFQ-MGB      6
13 PPMoV  VIC      NTC    NFQ-MGB      4
14 PPMoV  VIC      SC1    NFQ-MGB      1
15 PPMoV  VIC      SC2    NFQ-MGB      3
16 PPMoV  VIC      SC3    NFQ-MGB      3
17 PPMoV  VIC      SC4    NFQ-MGB      3
18 PPMoV  VIC      SC5    NFQ-MGB      3
19 PPMoV  VIC      <NA>   NFQ-MGB      2
raw_data |>
  count(`Well Position`) |>
  print(n = Inf)
# A tibble: 54 × 2
   `Well Position`     n
   <chr>           <int>
 1 A1                  1
 2 A10                 2
 3 A11                 2
 4 A12                 2
 5 A2                  1
 6 A3                  1
 7 A5                  1
 8 A6                  1
 9 A8                  1
10 B1                  1
11 B10                 2
12 B11                 2
13 B12                 2
14 B2                  1
15 B3                  1
16 B5                  1
17 B6                  1
18 C1                  1
19 C10                 2
20 C11                 2
21 C12                 2
22 C2                  1
23 C3                  1
24 C5                  1
25 C6                  1
26 C7                  1
27 D3                  1
28 D5                  1
29 D6                  1
30 D7                  1
31 E1                  1
32 E2                  1
33 E3                  1
34 E5                  1
35 E6                  1
36 E7                  1
37 F1                  1
38 F2                  1
39 F3                  1
40 F5                  1
41 F6                  1
42 F7                  1
43 G1                  1
44 G2                  1
45 G3                  1
46 G5                  1
47 G6                  1
48 G7                  1
49 H1                  1
50 H2                  1
51 H3                  1
52 H5                  1
53 H6                  1
54 H7                  1

We can count the multiplexing and save it as a column:

with_counts <- raw_data |>
  left_join(count(raw_data, Well, name = "n_multiplex"), by = join_by(Well))

It looks like the multiplexed wells didn’t amplifiy well. No N2 wells amplified. It also looks like there was poor amp in general.

with_counts |>
  count(Target, Reporter, Sample, n_multiplex, `Amp Status`) |>
  print(n = Inf)
# A tibble: 32 × 6
   Target Reporter Sample n_multiplex `Amp Status`     n
   <chr>  <chr>    <chr>        <int> <chr>        <int>
 1 N2     FAM      1                1 AMP              1
 2 N2     FAM      1                1 NO_AMP           1
 3 N2     FAM      1                2 NO_AMP           3
 4 N2     FAM      2                1 NO_AMP           2
 5 N2     FAM      2                2 NO_AMP           3
 6 N2     FAM      NTC              1 NO_AMP           1
 7 N2     FAM      NTC              2 NO_AMP           1
 8 N2     FAM      SC1              1 AMP              1
 9 N2     FAM      SC1              1 NO_AMP           2
10 N2     FAM      SC2              1 AMP              2
11 N2     FAM      SC2              1 NO_AMP           1
12 N2     FAM      SC3              1 AMP              2
13 N2     FAM      SC3              1 NO_AMP           1
14 N2     FAM      SC4              1 AMP              2
15 N2     FAM      SC4              1 NO_AMP           1
16 N2     FAM      SC5              1 AMP              2
17 N2     FAM      SC5              1 NO_AMP           1
18 N2     FAM      SC6              1 AMP              1
19 N2     FAM      SC6              1 NO_AMP           2
20 N2     FAM      <NA>             2 NO_AMP           2
21 PPMoV  VIC      1                1 AMP              3
22 PPMoV  VIC      1                2 AMP              3
23 PPMoV  VIC      2                1 AMP              3
24 PPMoV  VIC      2                2 AMP              3
25 PPMoV  VIC      NTC              1 NO_AMP           3
26 PPMoV  VIC      NTC              2 NO_AMP           1
27 PPMoV  VIC      SC1              1 AMP              1
28 PPMoV  VIC      SC2              1 AMP              3
29 PPMoV  VIC      SC3              1 AMP              3
30 PPMoV  VIC      SC4              1 AMP              3
31 PPMoV  VIC      SC5              1 AMP              3
32 PPMoV  VIC      <NA>             2 NO_AMP           2
with_counts |>
  count(Target, n_multiplex, `Amp Status`) |>
  print(n = Inf)
# A tibble: 7 × 4
  Target n_multiplex `Amp Status`     n
  <chr>        <int> <chr>        <int>
1 N2               1 AMP             11
2 N2               1 NO_AMP          12
3 N2               2 NO_AMP           9
4 PPMoV            1 AMP             19
5 PPMoV            1 NO_AMP           3
6 PPMoV            2 AMP              6
7 PPMoV            2 NO_AMP           3

Check that the thresholds are the same for every well with the same target

with_counts |>
  count(Target, Threshold) |>
  print(n = Inf)
# A tibble: 2 × 3
  Target Threshold     n
  <chr>      <dbl> <int>
1 N2          5.82    32
2 PPMoV       1.53    31

Amplification curves

amp_data <- list.files(
  paste0(data_dir, "qpcr"),
  pattern = "Amplification Data",
  full.names = TRUE,
) |>
  map(function(f) {
    read_csv(f,
      skip = 23,
      col_types = col_types,
    )
  }) |>
  list_rbind() |>
  left_join(with_counts,
    by = join_by(Well, `Well Position`, Sample, Target)
  ) |>
  print()
Warning: The following named parsers don't match the column names: Cq
# A tibble: 2,520 × 26
    Well `Well Position` `Cycle Number` Target    Rn     dRn Sample Omit.x
   <dbl> <chr>                    <dbl> <chr>  <dbl>   <dbl> <chr>  <lgl> 
 1     1 A1                           1 PPMoV   13.5 -1.41   1      FALSE 
 2     1 A1                           2 PPMoV   13.8 -1.23   1      FALSE 
 3     1 A1                           3 PPMoV   14.4 -0.752  1      FALSE 
 4     1 A1                           4 PPMoV   14.8 -0.474  1      FALSE 
 5     1 A1                           5 PPMoV   15.2 -0.290  1      FALSE 
 6     1 A1                           6 PPMoV   15.5 -0.0864 1      FALSE 
 7     1 A1                           7 PPMoV   15.8  0.0333 1      FALSE 
 8     1 A1                           8 PPMoV   16.0  0.138  1      FALSE 
 9     1 A1                           9 PPMoV   16.3  0.272  1      FALSE 
10     1 A1                          10 PPMoV   16.5  0.302  1      FALSE 
# ℹ 2,510 more rows
# ℹ 18 more variables: Omit.y <lgl>, Task <chr>, Reporter <chr>,
#   Quencher <chr>, `Amp Status` <chr>, `Amp Score` <dbl>,
#   `Curve Quality` <lgl>, `Result Quality Issues` <lgl>, Cq <dbl>,
#   `Cq Confidence` <dbl>, `Cq Mean` <dbl>, `Cq SD` <dbl>,
#   `Auto Threshold` <lgl>, Threshold <dbl>, `Auto Baseline` <lgl>,
#   `Baseline Start` <dbl>, `Baseline End` <dbl>, n_multiplex <int>

Amplification curves by sample

The one that amplified:

amp_data |>
  filter(`Amp Status` == "AMP") |>
  ggplot(mapping = aes(
    x = `Cycle Number`,
    y = dRn,
    group = Well,
    color = as.factor(n_multiplex),
  )) +
  geom_line() +
  geom_line(mapping = aes(
    x = `Cycle Number`,
    y = Threshold,
  )) +
  scale_y_log10() +
  facet_grid(
    rows = vars(Target), cols = vars(Sample), scales = "free_y"
  )
Warning in self$trans$transform(x): NaNs produced
Warning: Transformation introduced infinite values in continuous y-axis
Warning: Removed 169 rows containing missing values (`geom_line()`).

The ones that didn’t:

amp_data |>
  filter(`Amp Status` == "NO_AMP") |>
  ggplot(mapping = aes(
    x = `Cycle Number`,
    y = dRn,
    group = Well,
    color = as.factor(n_multiplex),
  )) +
  geom_line() +
  geom_line(mapping = aes(
    x = `Cycle Number`,
    y = Threshold,
  )) +
  scale_y_log10() +
  facet_grid(
    rows = vars(Target), cols = vars(Sample), scales = "free_y"
  )
Warning in self$trans$transform(x): NaNs produced
Warning: Transformation introduced infinite values in continuous y-axis
Warning: Removed 187 rows containing missing values (`geom_line()`).

Amplification curves by well

amp_data |>
  ggplot(mapping = aes(
    x = `Cycle Number`,
    y = dRn,
    color = as.factor(Target),
  )) +
  geom_line() +
  geom_line(mapping = aes(
    x = `Cycle Number`,
    y = Threshold
  )) +
  geom_vline(aes(xintercept = `Cq`, color = as.factor(Target)),
    linetype = "dashed"
  ) +
  scale_y_log10() +
  facet_wrap(~`Well Position`, scales = "free")
Warning in self$trans$transform(x): NaNs produced
Warning: Transformation introduced infinite values in continuous y-axis
Warning: Removed 15 rows containing missing values (`geom_line()`).
Warning: Removed 960 rows containing missing values (`geom_vline()`).