qa_multiple_subtypes

qa.qa_multiple_subtypes(
    wdrs_res_sum_output: str,
    test_res_output: str,
    wdrs_res_output: str,
    transformed_df_inp: pl.DataFrame,
)

QA multiple_subtypes

Usage

Use on a pl.DataFrame. LIMS sometimes labels two different subtypes as Detected for the same specimen; for example, one specimen may be labeled both H3 detected and H1N1 detected. This should not happen. The qa_multiple_subtypes function flags these records for further review.

Parameters

wdrs_res_sum_output : str

Name of the WDRS result summary output column produced by the transformation functions.

test_res_output : str

Name of the test result output column produced by the transformation functions.

wdrs_res_output : str

Name of the WDRS result output column produced by the transformation functions.

transformed_df_inp : pl.DataFrame

The transformed dataframe to check.

Examples

Here is a transformed dataframe containing a specimen with two detected subtypes, when it should have only one:

from wadoh_subtyping import qa
from wadoh_raccoon.utils import helpers
import polars as pl

transformed_df = pl.DataFrame({
    "submission_number": ['200','200'],
    "wdrs_res_sum_output": [
        "G_POSITIVE",
        "G_POSITIVE"
    ],
    "test_res_output": [
        "Influenza A (09 Pdm H1N1) detected",
        "Influenza A (H3) detected"
    ],
    "wdrs_res_output": [
        "G_FLU_A_(09_PDM_H1N1)_D",
        "G_FLU_A_(H3)_D"
    ]
})

Apply the function

qa_mult_subtypes = (
    qa.qa_multiple_subtypes(
        transformed_df_inp=transformed_df,
        wdrs_res_sum_output='wdrs_res_sum_output',
        test_res_output='test_res_output',
        wdrs_res_output='wdrs_res_output'
    )
)
test_res_output                      qa_multiple_subtypes
Influenza A (09 Pdm H1N1) detected   true
Influenza A (H3) detected            true

We can now use this result to flag the affected records in the transformed dataframe, so they can be filtered out for review:

apply_qa = (
    transformed_df
    .with_columns(
        # .implode() follows the guidance in the polars DeprecationWarning for
        # `is_in` with a collection of the same datatype;
        # see https://github.com/pola-rs/polars/issues/22149
        pl.when(
            pl.col('submission_number')
            .is_in(qa_mult_subtypes['submission_number'].implode())
        )
        .then(True)
        .otherwise(False)
        .alias('qa_multiple_subtypes')
    )
)