2023-10-16 Analysis of recent protocol-comparison experiments

Author

Dan Rice

Published

October 16, 2023

Objectives

See Twist

Preliminary work

Exported CSV files from Olivia’s uploaded .eds files. Also exported the metadata Google Sheets as CSV.

Data import

library(here)

here() starts at /Users/dan/notebook

library(readr)
library(dplyr)


library(purrr)
library(stringr)
library(ggplot2)
library(tidyr)
library(broom)

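# Extract the plate name from a qPCR export file name: everything before
# the export type (e.g. "Results") and the trailing date_time stamp.
get_plate <- function(f) {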
  str_extract(basename(f),
    "(.*)_(.*)_[0-9]{8}_[0-9]{6}\\.csv",
    group = 1
  )
}
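
As a quick check of the pattern, the sketch below applies `get_plate()` to two of the export file names used in this notebook. The greedy first group keeps everything before the export type (`Results` or `Amplification Data`) and the timestamp. It assumes stringr 1.5.0 or later, which added the `group` argument of `str_extract()`; the `qpcr/` prefix is only there to show that directories are dropped by `basename()`.

```r
library(stringr)

# Same helper as above, repeated so the example runs on its own.
get_plate <- function(f) {
  str_extract(basename(f),
    "(.*)_(.*)_[0-9]{8}_[0-9]{6}\\.csv",
    group = 1
  )
}

example_files <- c(
  "qpcr/2023-10-09_Cov2_PMMV_Results_20231010_125053.csv",
  "qpcr/2023-10-14_16S_Amplification Data_20231016_105058.csv"
)
plates <- get_plate(example_files)
print(plates)
```

The two results are `2023-10-09_Cov2_PMMV` and `2023-10-14_16S`, so the Results and Amplification Data exports of one plate share the same `plate` value.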

data_dir <- here("~", "airport")
experiments <- c(
  paste(
    "[2023-10-12] Settled Solids Protocol Development,",
    "Vortex Time and Centrifuge Settings"
  ),
  "[2023-10-10] Daily Processing Protocol Testing",
  "[2023-09-22] New Processing Tests"
)

filename_pattern <- "_Results_"
col_types <- list(
  Target = col_character(),
  Cq = col_double(),
  TreatmentGroup = col_character()
)
raw_data <- list.files(
  map_chr(experiments, function(exp) {
    here(data_dir, exp, "qpcr")
  }),
  pattern = filename_pattern,
  recursive = TRUE,
  full.names = TRUE,
) |>
  print() |>
  map(function(f) {
    read_csv(f, skip = 23, col_types = col_types) |>
      mutate(plate = get_plate(f))
  }) |>
  list_rbind() |>
  glimpse()

[1] "/Users/dan/airport/[2023-09-22] New Processing Tests/qpcr/2023-10-09_Cov2_PMMV_Results_20231010_125053.csv"                                                         
[2] "/Users/dan/airport/[2023-09-22] New Processing Tests/qpcr/2023-10-09_CrA_16S_Results_20231010_125152.csv"                                                           
[3] "/Users/dan/airport/[2023-09-22] New Processing Tests/qpcr/2023-10-09_Noro_Results_20231010_125241.csv"                                                              
[4] "/Users/dan/airport/[2023-10-12] Settled Solids Protocol Development, Vortex Time and Centrifuge Settings/qpcr/2023-10-14_16S_Results_20231016_105057.csv"           
[5] "/Users/dan/airport/[2023-10-12] Settled Solids Protocol Development, Vortex Time and Centrifuge Settings/qpcr/2023-10-14_Cov2_CORRECTED_Results_20231016_133517.csv"
[6] "/Users/dan/airport/[2023-10-12] Settled Solids Protocol Development, Vortex Time and Centrifuge Settings/qpcr/2023-10-14_CrA_Results_20231016_104600.csv"           
[7] "/Users/dan/airport/[2023-10-12] Settled Solids Protocol Development, Vortex Time and Centrifuge Settings/qpcr/2023-10-14_Noro_Results_20231016_130005.csv"          
[8] "/Users/dan/airport/[2023-10-12] Settled Solids Protocol Development, Vortex Time and Centrifuge Settings/qpcr/2023-10-14_PMMoV_Results_20231016_104527.csv"

Warning: The following named parsers don't match the column names:
TreatmentGroup

Warning: One or more parsing issues, call `problems()` on your data frame for details,
e.g.:
  dat <- vroom(...)
  problems(dat)

Rows: 481
Columns: 22
$ Well                    <dbl> 1, 2, 3, 4, 5, 6, 7, 8, 9, 10, 11, 12, 13, 14,…
$ `Well Position`         <chr> "A1", "A2", "A3", "A4", "A5", "A6", "A7", "A8"…
$ Omit                    <lgl> FALSE, FALSE, FALSE, FALSE, FALSE, FALSE, FALS…
$ Sample                  <chr> "1A", "1A", "1A", "10000.0", "10000.0", "10000…
$ Target                  <chr> "Cov2", "Cov2", "Cov2", "Cov2", "Cov2", "Cov2"…
$ Task                    <chr> "UNKNOWN", "UNKNOWN", "UNKNOWN", "STANDARD", "…
$ Reporter                <chr> "FAM", "FAM", "FAM", "FAM", "FAM", "FAM", "FAM…
$ Quencher                <chr> "NFQ-MGB", "NFQ-MGB", "NFQ-MGB", "NFQ-MGB", "N…
$ `Amp Status`            <chr> "AMP", "AMP", "AMP", "AMP", "AMP", "AMP", "AMP…
$ `Amp Score`             <dbl> 1.3915582, 1.4014582, 1.4073581, 1.4090574, 1.…
$ `Curve Quality`         <lgl> NA, NA, NA, NA, NA, NA, NA, NA, NA, NA, NA, NA…
$ `Result Quality Issues` <lgl> NA, NA, NA, NA, NA, NA, NA, NA, NA, NA, NA, NA…
$ Cq                      <dbl> 33.15919, 32.98389, 32.66178, 22.39386, 22.220…
$ `Cq Confidence`         <dbl> 0.9759085, 0.9891340, 0.9883204, 0.9891666, 0.…
$ `Cq Mean`               <dbl> 32.93495, 32.93495, 32.93495, 22.31821, 22.318…
$ `Cq SD`                 <dbl> 0.25229116, 0.25229116, 0.25229116, 0.08875503…
$ `Auto Threshold`        <lgl> TRUE, TRUE, TRUE, TRUE, TRUE, TRUE, TRUE, TRUE…
$ Threshold               <dbl> 0.2999157, 0.2999157, 0.2999157, 0.2999157, 0.…
$ `Auto Baseline`         <lgl> TRUE, TRUE, TRUE, TRUE, TRUE, TRUE, TRUE, TRUE…
$ `Baseline Start`        <dbl> 3, 3, 3, 3, 3, 3, 3, 3, 3, 3, 3, 3, 3, 3, 3, 3…
$ `Baseline End`          <dbl> 27, 27, 26, 16, 16, 16, 24, 23, 23, 20, 20, 19…
$ plate                   <chr> "2023-10-09_Cov2_PMMV", "2023-10-09_Cov2_PMMV"…
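
The "named parsers don't match the column names" warnings above come from passing one shared `col_types` list, which names `TreatmentGroup`, to files that do not contain that column. One possible way to avoid them is to give each file type its own specification. The sketch below shows this on a small temporary file with made-up values; the name `results_col_types` and the toy file are assumptions for illustration, not part of the original analysis.

```r
library(readr)

# Toy stand-in for a results export; the values are made up.
tmp <- tempfile(fileext = ".csv")
writeLines(c(
  "Well,Target,Cq",
  "1,Cov2,33.1",
  "2,Cov2,32.9"
), tmp)

# Hypothetical spec that names only columns present in the results files.
# Columns not listed here (such as Well) are still guessed by readr.
results_col_types <- cols(
  Target = col_character(),
  Cq = col_double()
)

results <- read_csv(tmp, col_types = results_col_types)
print(results)
```

A second specification containing only `TreatmentGroup` could then be used for the metadata files.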

# Read the metadata sheet exported for each experiment and stack them.
metadata <- experiments |>
  map(function(exp) {
    read_csv(here(data_dir, exp, "metadata.csv"), col_types = col_types)
  }) |>
  list_rbind() |>
  glimpse()

Warning: The following named parsers don't match the column names: Target, Cq

Rows: 21
Columns: 13
$ Sample_ID         <chr> "1-1", "1-2", "2-1", "2-2", "3-1", "3-2", "4-1", "4-…
$ TreatmentGroup    <chr> "1", "1", "2", "2", "3", "3", "4", "4", "Centrifuge …
$ VortexMin         <dbl> 20, 20, 20, 20, 5, 5, 5, 5, NA, NA, NA, NA, NA, NA, …
$ CFSpeed           <dbl> 10000, 10000, 3500, 3500, 10000, 10000, 3500, 3500, …
$ CollectionDate    <date> 2023-10-12, 2023-10-12, 2023-10-12, 2023-10-12, 202…
$ Source            <chr> "Solids", "Solids", "Solids", "Solids", "Solids", "S…
$ Volume            <dbl> 15, 15, 15, 15, 15, 15, 15, 15, 200, 200, 20, 200, 2…
$ Qubit_ID          <lgl> NA, NA, NA, NA, NA, NA, NA, NA, NA, NA, NA, NA, NA, …
$ ProcessingHandler <chr> "Ari", "Ari", "Ari", "Ari", "Ari", "Ari", "Ari", "Ar…
$ ExtractionHandler <chr> "Ari", "Ari", "Ari", "Ari", "Ari", "Ari", "Ari", "Ar…
$ `qPCR date`       <date> 2023-10-14, 2023-10-14, 2023-10-14, 2023-10-14, 202…
$ qPCRHandler       <chr> "Olivia", "Olivia", "Olivia", "Olivia", "Olivia", "O…
$ qPCR_dilution     <chr> NA, NA, NA, NA, NA, NA, NA, NA, NA, NA, NA, NA, NA, …

tidy_data <- raw_data |>
  separate_wider_regex(
    `Well Position`,
    c(well_row = "[A-Z]+", well_col = "[0-9]+"),
    cols_remove = FALSE,
  ) |>
  left_join(metadata, by = join_by(Sample == Sample_ID)) |>
  mutate(Target = if_else(Target == "PMMV", "PMMoV", Target)) |>
  glimpse()

Rows: 481
Columns: 36
$ Well                    <dbl> 1, 2, 3, 4, 5, 6, 7, 8, 9, 10, 11, 12, 13, 14,…
$ well_row                <chr> "A", "A", "A", "A", "A", "A", "A", "A", "A", "…
$ well_col                <chr> "1", "2", "3", "4", "5", "6", "7", "8", "9", "…
$ `Well Position`         <chr> "A1", "A2", "A3", "A4", "A5", "A6", "A7", "A8"…
$ Omit                    <lgl> FALSE, FALSE, FALSE, FALSE, FALSE, FALSE, FALS…
$ Sample                  <chr> "1A", "1A", "1A", "10000.0", "10000.0", "10000…
$ Target                  <chr> "Cov2", "Cov2", "Cov2", "Cov2", "Cov2", "Cov2"…
$ Task                    <chr> "UNKNOWN", "UNKNOWN", "UNKNOWN", "STANDARD", "…
$ Reporter                <chr> "FAM", "FAM", "FAM", "FAM", "FAM", "FAM", "FAM…
$ Quencher                <chr> "NFQ-MGB", "NFQ-MGB", "NFQ-MGB", "NFQ-MGB", "N…
$ `Amp Status`            <chr> "AMP", "AMP", "AMP", "AMP", "AMP", "AMP", "AMP…
$ `Amp Score`             <dbl> 1.3915582, 1.4014582, 1.4073581, 1.4090574, 1.…
$ `Curve Quality`         <lgl> NA, NA, NA, NA, NA, NA, NA, NA, NA, NA, NA, NA…
$ `Result Quality Issues` <lgl> NA, NA, NA, NA, NA, NA, NA, NA, NA, NA, NA, NA…
$ Cq                      <dbl> 33.15919, 32.98389, 32.66178, 22.39386, 22.220…
$ `Cq Confidence`         <dbl> 0.9759085, 0.9891340, 0.9883204, 0.9891666, 0.…
$ `Cq Mean`               <dbl> 32.93495, 32.93495, 32.93495, 22.31821, 22.318…
$ `Cq SD`                 <dbl> 0.25229116, 0.25229116, 0.25229116, 0.08875503…
$ `Auto Threshold`        <lgl> TRUE, TRUE, TRUE, TRUE, TRUE, TRUE, TRUE, TRUE…
$ Threshold               <dbl> 0.2999157, 0.2999157, 0.2999157, 0.2999157, 0.…
$ `Auto Baseline`         <lgl> TRUE, TRUE, TRUE, TRUE, TRUE, TRUE, TRUE, TRUE…
$ `Baseline Start`        <dbl> 3, 3, 3, 3, 3, 3, 3, 3, 3, 3, 3, 3, 3, 3, 3, 3…
$ `Baseline End`          <dbl> 27, 27, 26, 16, 16, 16, 24, 23, 23, 20, 20, 19…
$ plate                   <chr> "2023-10-09_Cov2_PMMV", "2023-10-09_Cov2_PMMV"…
$ TreatmentGroup          <chr> "Centrifuge + 0.45 filter", "Centrifuge + 0.45…
$ VortexMin               <dbl> NA, NA, NA, NA, NA, NA, NA, NA, NA, NA, NA, NA…
$ CFSpeed                 <dbl> NA, NA, NA, NA, NA, NA, NA, NA, NA, NA, NA, NA…
$ CollectionDate          <date> 2023-10-02, 2023-10-02, 2023-10-02, NA, NA, N…
$ Source                  <chr> "N-S mix", "N-S mix", "N-S mix", NA, NA, NA, "…
$ Volume                  <dbl> 280, 280, 280, NA, NA, NA, 280, 280, 280, NA, …
$ Qubit_ID                <lgl> NA, NA, NA, NA, NA, NA, NA, NA, NA, NA, NA, NA…
$ ProcessingHandler       <chr> "Ari", "Ari", "Ari", NA, NA, NA, "Ari", "Ari",…
$ ExtractionHandler       <chr> NA, NA, NA, NA, NA, NA, NA, NA, NA, NA, NA, NA…
$ `qPCR date`             <date> NA, NA, NA, NA, NA, NA, NA, NA, NA, NA, NA, N…
$ qPCRHandler             <chr> "Olivia", "Olivia", "Olivia", NA, NA, NA, "Oli…
$ qPCR_dilution           <chr> "1:5", "1:5", "1:5", NA, NA, NA, "1:5", "1:5",…
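
Because this is a left join, wells whose `Sample` has no matching `Sample_ID` (the standards, for example) keep `NA` in every metadata column, as the `Source` and `Volume` columns above show. A minimal sketch of listing the unmatched sample labels follows. It uses toy tibbles with made-up rows in place of `raw_data` and `metadata`; only the column names follow the notebook.

```r
library(dplyr)

# Toy stand-ins for raw_data and metadata; the rows are made up.
toy_raw <- tibble(
  Sample = c("1A", "1A", "10000.0", NA),
  Task = c("UNKNOWN", "UNKNOWN", "STANDARD", "NTC")
)
toy_metadata <- tibble(
  Sample_ID = c("1A", "2A"),
  Source = c("N-S mix", "N-S mix")
)

# Rows of toy_raw with no metadata match, counted by task and sample label.
unmatched <- toy_raw |>
  anti_join(toy_metadata, by = join_by(Sample == Sample_ID)) |>
  count(Task, Sample)
print(unmatched)
```

Any `UNKNOWN` rows in such a table would point to a sample label that differs between the plate setup and the metadata sheet.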

amp_data <- list.files(
  map_chr(experiments, function(exp) {
    here(data_dir, exp, "qpcr")
  }),
  pattern = "Amplification Data",
  recursive = TRUE,
  full.names = TRUE,
) |>
  print() |>
  map(function(f) {
    read_csv(f,
      skip = 23,
      col_types = col_types,
    ) |>
      mutate(plate = get_plate(f))
  }) |>
  list_rbind() |>
  mutate(Target = if_else(Target == "PMMV", "PMMoV", Target)) |>
  left_join(tidy_data,
    by = join_by(plate, Well, `Well Position`, Sample, Omit, Target)
  ) |>
  glimpse()

[1] "/Users/dan/airport/[2023-09-22] New Processing Tests/qpcr/2023-10-09_Cov2_PMMV_Amplification Data_20231010_125053.csv"                                                         
[2] "/Users/dan/airport/[2023-09-22] New Processing Tests/qpcr/2023-10-09_CrA_16S_Amplification Data_20231010_125152.csv"                                                           
[3] "/Users/dan/airport/[2023-09-22] New Processing Tests/qpcr/2023-10-09_Noro_Amplification Data_20231010_125241.csv"                                                              
[4] "/Users/dan/airport/[2023-10-12] Settled Solids Protocol Development, Vortex Time and Centrifuge Settings/qpcr/2023-10-14_16S_Amplification Data_20231016_105058.csv"           
[5] "/Users/dan/airport/[2023-10-12] Settled Solids Protocol Development, Vortex Time and Centrifuge Settings/qpcr/2023-10-14_Cov2_CORRECTED_Amplification Data_20231016_133517.csv"
[6] "/Users/dan/airport/[2023-10-12] Settled Solids Protocol Development, Vortex Time and Centrifuge Settings/qpcr/2023-10-14_CrA_Amplification Data_20231016_104600.csv"           
[7] "/Users/dan/airport/[2023-10-12] Settled Solids Protocol Development, Vortex Time and Centrifuge Settings/qpcr/2023-10-14_Noro_Amplification Data_20231016_130005.csv"          
[8] "/Users/dan/airport/[2023-10-12] Settled Solids Protocol Development, Vortex Time and Centrifuge Settings/qpcr/2023-10-14_PMMoV_Amplification Data_20231016_104527.csv"

Warning: The following named parsers don't match the column names: Cq, TreatmentGroup

Rows: 19,240
Columns: 39
$ Well                    <dbl> 1, 1, 1, 1, 1, 1, 1, 1, 1, 1, 1, 1, 1, 1, 1, 1…
$ `Well Position`         <chr> "A1", "A1", "A1", "A1", "A1", "A1", "A1", "A1"…
$ `Cycle Number`          <dbl> 1, 2, 3, 4, 5, 6, 7, 8, 9, 10, 11, 12, 13, 14,…
$ Target                  <chr> "Cov2", "Cov2", "Cov2", "Cov2", "Cov2", "Cov2"…
$ Rn                      <dbl> 0.6443956, 0.6382574, 0.6284555, 0.6179680, 0.…
$ dRn                     <dbl> 2.836028e-02, 2.374099e-02, 1.545804e-02, 6.48…
$ Sample                  <chr> "1A", "1A", "1A", "1A", "1A", "1A", "1A", "1A"…
$ Omit                    <lgl> FALSE, FALSE, FALSE, FALSE, FALSE, FALSE, FALS…
$ plate                   <chr> "2023-10-09_Cov2_PMMV", "2023-10-09_Cov2_PMMV"…
$ well_row                <chr> "A", "A", "A", "A", "A", "A", "A", "A", "A", "…
$ well_col                <chr> "1", "1", "1", "1", "1", "1", "1", "1", "1", "…
$ Task                    <chr> "UNKNOWN", "UNKNOWN", "UNKNOWN", "UNKNOWN", "U…
$ Reporter                <chr> "FAM", "FAM", "FAM", "FAM", "FAM", "FAM", "FAM…
$ Quencher                <chr> "NFQ-MGB", "NFQ-MGB", "NFQ-MGB", "NFQ-MGB", "N…
$ `Amp Status`            <chr> "AMP", "AMP", "AMP", "AMP", "AMP", "AMP", "AMP…
$ `Amp Score`             <dbl> 1.391558, 1.391558, 1.391558, 1.391558, 1.3915…
$ `Curve Quality`         <lgl> NA, NA, NA, NA, NA, NA, NA, NA, NA, NA, NA, NA…
$ `Result Quality Issues` <lgl> NA, NA, NA, NA, NA, NA, NA, NA, NA, NA, NA, NA…
$ Cq                      <dbl> 33.15919, 33.15919, 33.15919, 33.15919, 33.159…
$ `Cq Confidence`         <dbl> 0.9759085, 0.9759085, 0.9759085, 0.9759085, 0.…
$ `Cq Mean`               <dbl> 32.93495, 32.93495, 32.93495, 32.93495, 32.934…
$ `Cq SD`                 <dbl> 0.2522912, 0.2522912, 0.2522912, 0.2522912, 0.…
$ `Auto Threshold`        <lgl> TRUE, TRUE, TRUE, TRUE, TRUE, TRUE, TRUE, TRUE…
$ Threshold               <dbl> 0.2999157, 0.2999157, 0.2999157, 0.2999157, 0.…
$ `Auto Baseline`         <lgl> TRUE, TRUE, TRUE, TRUE, TRUE, TRUE, TRUE, TRUE…
$ `Baseline Start`        <dbl> 3, 3, 3, 3, 3, 3, 3, 3, 3, 3, 3, 3, 3, 3, 3, 3…
$ `Baseline End`          <dbl> 27, 27, 27, 27, 27, 27, 27, 27, 27, 27, 27, 27…
$ TreatmentGroup          <chr> "Centrifuge + 0.45 filter", "Centrifuge + 0.45…
$ VortexMin               <dbl> NA, NA, NA, NA, NA, NA, NA, NA, NA, NA, NA, NA…
$ CFSpeed                 <dbl> NA, NA, NA, NA, NA, NA, NA, NA, NA, NA, NA, NA…
$ CollectionDate          <date> 2023-10-02, 2023-10-02, 2023-10-02, 2023-10-0…
$ Source                  <chr> "N-S mix", "N-S mix", "N-S mix", "N-S mix", "N…
$ Volume                  <dbl> 280, 280, 280, 280, 280, 280, 280, 280, 280, 2…
$ Qubit_ID                <lgl> NA, NA, NA, NA, NA, NA, NA, NA, NA, NA, NA, NA…
$ ProcessingHandler       <chr> "Ari", "Ari", "Ari", "Ari", "Ari", "Ari", "Ari…
$ ExtractionHandler       <chr> NA, NA, NA, NA, NA, NA, NA, NA, NA, NA, NA, NA…
$ `qPCR date`             <date> NA, NA, NA, NA, NA, NA, NA, NA, NA, NA, NA, N…
$ qPCRHandler             <chr> "Olivia", "Olivia", "Olivia", "Olivia", "Olivi…
$ qPCR_dilution           <chr> "1:5", "1:5", "1:5", "1:5", "1:5", "1:5", "1:5…

Quality control

tidy_data |> count(Task, is.na(Cq))

# A tibble: 6 × 3
  Task     `is.na(Cq)`     n
  <chr>    <lgl>       <int>
1 NTC      FALSE           3
2 NTC      TRUE           28
3 STANDARD FALSE         133
4 STANDARD TRUE            2
5 UNKNOWN  FALSE         314
6 UNKNOWN  TRUE            1

tidy_data |>
  filter(Task == "NTC", !is.na(Cq)) |>
  glimpse()

Rows: 3
Columns: 36
$ Well                    <dbl> 85, 86, 87
$ well_row                <chr> "H", "H", "H"
$ well_col                <chr> "1", "2", "3"
$ `Well Position`         <chr> "H1", "H2", "H3"
$ Omit                    <lgl> FALSE, FALSE, FALSE
$ Sample                  <chr> NA, NA, NA
$ Target                  <chr> "16S", "16S", "16S"
$ Task                    <chr> "NTC", "NTC", "NTC"
$ Reporter                <chr> "FAM", "FAM", "FAM"
$ Quencher                <chr> "NFQ-MGB", "NFQ-MGB", "NFQ-MGB"
$ `Amp Status`            <chr> "AMP", "AMP", "AMP"
$ `Amp Score`             <dbl> 1.418582, 1.403374, 1.409229
$ `Curve Quality`         <lgl> NA, NA, NA
$ `Result Quality Issues` <lgl> NA, NA, NA
$ Cq                      <dbl> 29.98365, 30.00206, 29.95623
$ `Cq Confidence`         <dbl> 0.9892989, 0.9831535, 0.9896414
$ `Cq Mean`               <dbl> 29.98065, 29.98065, 29.98065
$ `Cq SD`                 <dbl> 0.02306268, 0.02306268, 0.02306268
$ `Auto Threshold`        <lgl> TRUE, TRUE, TRUE
$ Threshold               <dbl> 0.2693924, 0.2693924, 0.2693924
$ `Auto Baseline`         <lgl> TRUE, TRUE, TRUE
$ `Baseline Start`        <dbl> 3, 3, 3
$ `Baseline End`          <dbl> 23, 21, 23
$ plate                   <chr> "2023-10-14_16S", "2023-10-14_16S", "2023-10-1…
$ TreatmentGroup          <chr> NA, NA, NA
$ VortexMin               <dbl> NA, NA, NA
$ CFSpeed                 <dbl> NA, NA, NA
$ CollectionDate          <date> NA, NA, NA
$ Source                  <chr> NA, NA, NA
$ Volume                  <dbl> NA, NA, NA
$ Qubit_ID                <lgl> NA, NA, NA
$ ProcessingHandler       <chr> NA, NA, NA
$ ExtractionHandler       <chr> NA, NA, NA
$ `qPCR date`             <date> NA, NA, NA
$ qPCRHandler             <chr> NA, NA, NA
$ qPCR_dilution           <chr> NA, NA, NA

amp_data |>
  filter(Task == "NTC") |>
  ggplot(aes(x = `Cycle Number`, y = dRn)) +
  geom_line(mapping = aes(
    group = Well,
  )) +
  geom_line(mapping = aes(
    x = `Cycle Number`,
    y = Threshold
  ), color = "Grey") +
  scale_y_log10(limits = c(1e-3, 1e1)) +
  facet_wrap(~ interaction(plate, Target))

Warning in self$trans$transform(x): NaNs produced

Warning: Transformation introduced infinite values in continuous y-axis

Warning: Removed 234 rows containing missing values (`geom_line()`).

There is amplification of the NTC for the 2023-10-14_16S plate. Olivia says:

Those are not an error. We should discuss this: the plates we are using for qPCR are not sterile. That’s not a problem for most assays, but so far we’ve almost always had amplification in the 16S negative controls. It’s usually much lower than the samples and lowest standards, though.
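
One way to put a number on this is to compare the NTC Cq with the latest Cq among the standards and samples on the same plate, reading "much lower" as less template and therefore a later Cq (that reading is an interpretation). The sketch below uses a toy tibble: the NTC values are the three 16S NTC Cq values shown above, while the standard and unknown values are made up for illustration.

```r
library(dplyr)

toy_16s <- tibble(
  Task = c(rep("NTC", 3), rep("STANDARD", 3), rep("UNKNOWN", 3)),
  Cq = c(
    29.98365, 30.00206, 29.95623, # NTC values from the glimpse above
    12.1, 18.9, 25.6,             # made-up standards
    14.2, 15.0, 16.3              # made-up unknowns
  )
)

# Range of Cq per task; a higher Cq means less starting template.
cq_by_task <- toy_16s |>
  group_by(Task) |>
  summarize(min_cq = min(Cq), max_cq = max(Cq))

# Gap in cycles between the earliest NTC and the latest non-NTC well.
ntc_margin <- min(toy_16s$Cq[toy_16s$Task == "NTC"]) -
  max(toy_16s$Cq[toy_16s$Task != "NTC"])
print(cq_by_task)
print(ntc_margin)
```

A positive margin means every NTC well crossed the threshold later than every standard and sample in the table.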

To verify, plot the NTC curves against the standards on the same plate:

amp_data |>
  filter(plate == "2023-10-14_16S", Task %in% c("NTC", "STANDARD")) |>
  ggplot(aes(x = `Cycle Number`, y = dRn)) +
  geom_line(mapping = aes(
    color = Task,
    group = Well,
  )) +
  geom_line(mapping = aes(
    x = `Cycle Number`,
    y = Threshold
  ), color = "Grey") +
  scale_y_log10(limits = c(1e-3, 1e1))

Warning in self$trans$transform(x): NaNs produced

Warning: Transformation introduced infinite values in continuous y-axis

Warning: Removed 29 rows containing missing values (`geom_line()`).

amp_data |>
  filter(Task == "UNKNOWN" & is.na(Cq)) |>
  print() |>
  ggplot(aes(x = `Cycle Number`, y = dRn)) +
  geom_line(mapping = aes(
    color = Task,
    group = Well,
  )) +
  geom_line(mapping = aes(
    x = `Cycle Number`,
    y = Threshold
  ), color = "Grey") +
  scale_y_log10(limits = c(1e-3, 1e1))

# A tibble: 40 × 39
    Well `Well Position` `Cycle Number` Target    Rn      dRn Sample Omit  plate
   <dbl> <chr>                    <dbl> <chr>  <dbl>    <dbl> <chr>  <lgl> <chr>
 1    90 H6                           1 PMMoV  0.350 -0.0310  4-2    FALSE 2023…
 2    90 H6                           2 PMMoV  0.353 -0.0290  4-2    FALSE 2023…
 3    90 H6                           3 PMMoV  0.363 -0.0211  4-2    FALSE 2023…
 4    90 H6                           4 PMMoV  0.371 -0.0143  4-2    FALSE 2023…
 5    90 H6                           5 PMMoV  0.378 -0.00919 4-2    FALSE 2023…
 6    90 H6                           6 PMMoV  0.383 -0.00520 4-2    FALSE 2023…
 7    90 H6                           7 PMMoV  0.389 -0.00109 4-2    FALSE 2023…
 8    90 H6                           8 PMMoV  0.393  0.00154 4-2    FALSE 2023…
 9    90 H6                           9 PMMoV  0.395  0.00290 4-2    FALSE 2023…
10    90 H6                          10 PMMoV  0.398  0.00399 4-2    FALSE 2023…
# ℹ 30 more rows
# ℹ 30 more variables: well_row <chr>, well_col <chr>, Task <chr>,
#   Reporter <chr>, Quencher <chr>, `Amp Status` <chr>, `Amp Score` <dbl>,
#   `Curve Quality` <lgl>, `Result Quality Issues` <lgl>, Cq <dbl>,
#   `Cq Confidence` <dbl>, `Cq Mean` <dbl>, `Cq SD` <dbl>,
#   `Auto Threshold` <lgl>, Threshold <dbl>, `Auto Baseline` <lgl>,
#   `Baseline Start` <dbl>, `Baseline End` <dbl>, TreatmentGroup <chr>, …

Warning in self$trans$transform(x): NaNs produced

Warning: Transformation introduced infinite values in continuous y-axis

Warning: Removed 19 rows containing missing values (`geom_line()`).

All the amplification curves

amp_data |>
  ggplot(aes(x = `Cycle Number`, y = dRn)) +
  geom_line(mapping = aes(
    color = Task,
    group = Well,
  )) +
  geom_line(mapping = aes(
    x = `Cycle Number`,
    y = Threshold
  ), color = "Grey") +
  scale_y_log10(limits = c(1e-3, 1e1)) +
  facet_grid(rows = vars(plate), cols = vars(Target))

Warning in self$trans$transform(x): NaNs produced

Warning: Transformation introduced infinite values in continuous y-axis

Warning: Removed 295 rows containing missing values (`geom_line()`).

Compare methods

tidy_data |>
  filter(Task == "UNKNOWN") |>
  ggplot(mapping = aes(
    x = Cq,
    y = TreatmentGroup,
    color = Source,
    shape = as.factor(CollectionDate),
  )) +
  stat_summary(
    fun.min = min,
    fun.max = max,
    fun = median,
    position = position_dodge(width = 0.2),
    size = 0.2
  ) +
  facet_wrap(facets = ~Target, scales = "free_x")

Warning: Removed 1 rows containing non-finite values (`stat_summary()`).

Mike: @Dan R for the second experiment ([2023-10-10] Daily Processing Protocol Testing) can you create a figure where the y-axis is the wastewater sample (N Inf, S Inf, or SS), and the color is the treatment?

tidy_data |>
  filter(Task == "UNKNOWN", CollectionDate == "2023-10-11") |>
  ggplot(mapping = aes(
    x = Cq,
    y = Source,
    color = TreatmentGroup,
  )) +
  stat_summary(
    fun.min = min,
    fun.max = max,
    fun = median,
    position = position_dodge(width = 0.2),
    size = 0.2
  ) +
  facet_wrap(facets = ~Target, scales = "free_x") +
  theme(legend.position = "bottom") +
  theme(panel.spacing.x = unit(6, "mm"))

Then for the 3rd experiment (2×2 design), maybe set color to centrifuge and shape to vortex treatment, and make one plot where y = centrifuge and a second plot where y = vortex.

tidy_data |>
  filter(Task == "UNKNOWN", CollectionDate == "2023-10-12") |>
  ggplot(mapping = aes(
    x = Cq,
    y = as.factor(CFSpeed),
    color = as.factor(VortexMin),
    group = Sample
  )) +
  stat_summary(
    fun.min = min,
    fun.max = max,
    fun = median,
    position = position_dodge(width = 0.2),
    size = 0.2
  ) +
  facet_wrap(facets = ~Target, scales = "free_x") +
  theme(legend.position = "bottom") +
  theme(panel.spacing.x = unit(6, "mm"))

Warning: Removed 1 rows containing non-finite values (`stat_summary()`).

tidy_data |>
  filter(Task == "UNKNOWN", CollectionDate == "2023-10-12") |>
  ggplot(mapping = aes(
    x = Cq,
    y = as.factor(VortexMin),
    color = as.factor(CFSpeed),
    group = Sample
  )) +
  stat_summary(
    fun.min = min,
    fun.max = max,
    fun = median,
    position = position_dodge(width = 0.2),
    size = 0.2
  ) +
  facet_wrap(facets = ~Target, scales = "free_x") +
  theme(legend.position = "bottom") +
  theme(panel.spacing.x = unit(6, "mm"))

Warning: Removed 1 rows containing non-finite values (`stat_summary()`).
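
The same 2×2 comparison can also be tabulated alongside the plots. This is a sketch on a toy tibble with made-up Cq values (only the `CFSpeed` and `VortexMin` levels come from the metadata above). With the real data, `toy_2x2` would be replaced by `tidy_data` filtered as in the plots.

```r
library(dplyr)
library(tidyr)

toy_2x2 <- tibble(
  Target = "PMMoV",
  CFSpeed = rep(c(10000, 3500), each = 4),
  VortexMin = rep(rep(c(20, 5), each = 2), times = 2),
  Cq = c(25.1, 25.3, 26.0, 26.2, 25.6, 25.8, 26.9, 27.1) # made up
)

# Median Cq per cell of the design: one row per centrifuge speed and
# one column per vortex time.
median_table <- toy_2x2 |>
  group_by(Target, CFSpeed, VortexMin) |>
  summarize(median_cq = median(Cq), .groups = "drop") |>
  pivot_wider(
    names_from = VortexMin,
    values_from = median_cq,
    names_prefix = "vortex_"
  )
print(median_table)
```

Reading across a row shows the vortex-time difference at a fixed centrifuge speed, and reading down a column shows the centrifuge difference at a fixed vortex time.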