from wadoh_subtyping import qa
from wadoh_raccoon.utils import helpers
import polars as pl

# Example input: one submission (200) reported with two detected subtypes
transformed_df = pl.DataFrame({
    "submission_number": ['200', '200'],
    "wdrs_res_sum_output": [
        "G_POSITIVE",
        "G_POSITIVE"
    ],
    "test_res_output": [
        "Influenza A (09 Pdm H1N1) detected",
        "Influenza A (H3) detected"
    ],
    "wdrs_res_output": [
        "G_FLU_A_(09_PDM_H1N1)_D",
        "G_FLU_A_(H3)_D"
    ]
})

qa_multiple_subtypes
qa.qa_multiple_subtypes(
    wdrs_res_sum_output: str,
    test_res_output: str,
    wdrs_res_output: str,
    transformed_df_inp: pl.DataFrame,
)

QA multiple_subtypes
Usage
To be used on a pl.DataFrame. LIMS sometimes labels two different subtypes as detected for the same specimen; for example, one specimen may be labeled both H3 detected and H1N1 detected. This shouldn’t happen. The qa_multiple_subtypes function flags these records for further review.
Parameters
wdrs_res_sum_output : str
    wdrs result summary output column from transformation functions
test_res_output : str
    test result output column from transformation functions
wdrs_res_output : str
    wdrs result output column from transformation functions
transformed_df_inp : pl.DataFrame
    transformed dataframe
Examples
The transformed dataframe built at the top of this page contains a submission with multiple subtypes (when it should have only one subtype).
Apply the function
qa_mult_subtypes = (
    qa.qa_multiple_subtypes(
        transformed_df_inp=transformed_df,
        wdrs_res_sum_output='wdrs_res_sum_output',
        test_res_output='test_res_output',
        wdrs_res_output='wdrs_res_output'
    )
)

| test_res_output | qa_multiple_subtypes |
|---|---|
| Influenza A (09 Pdm H1N1) detected | true |
| Influenza A (H3) detected | true |
We can now use the flagged submissions to mark the affected records in the transformed dataframe, so they can be filtered out:
apply_qa = (
    transformed_df
    .with_columns(
        # implode() wraps the flagged submission numbers in a single list,
        # which avoids the `is_in` DeprecationWarning in recent Polars versions
        # (see https://github.com/pola-rs/polars/issues/22149)
        pl.when(
            pl.col('submission_number')
            .is_in(qa_mult_subtypes['submission_number'].implode())
        )
        .then(True)
        .otherwise(False)
        .alias('qa_multiple_subtypes')
    )
)